ABSTRACT

In  order  to  provide  the  capability  for  submarines  to  communicate 
through  a  satellite  while  remaining  submerged  and  traveling  at 
operational  speeds  a  towed  buoyant  cable  array  antenna  is  being 
developed.  The  array  is  adaptive  from  the  point  of  view  that  the 
direction  of  the  satellite  need  not  be  known,  the  position  and 
orientation of the array need not be known, and the shape of the
flexible  array  need  not  be  known.  A  blind  equalization 
procedure  is  used  to  estimate  the  signal  space  from  the  downlink 
signal  and  create  a  spatial  matched  filter  for  receive.  While  the 
frequency  division  satellite  system  is  intended  to  allow  only  one 
signal  per  frequency  slot,  the  system  can  also  operate  in  the 
presence  of  jamming  by  separating  multiple  sources  spatially. 

Once  the  downlink  receive  antenna  weights  have  been  obtained, 
the  more  difficult  task  of  obtaining  uplink  weights  at  a  separated 
frequency  must  be  performed.  Since  no  data  is  available  for 
blind  equalization  at  the  transmit  frequency  a  frequency 
extrapolation  method  is  used  to  extend  the  downlink  receive 
weights  to  frequencies  beyond  where  the  equalization  data  was 
collected.  This  extrapolation  is  complicated  by  2-pi  ambiguities 
of  the  measured  phases  as  well  as  amplification  of  measurement 
errors  in  the  extrapolation  process.  An  algorithm  has  been 
developed that performs well.

1.  INTRODUCTION 

An adaptive array antenna is being developed to provide the
capability for submarines to communicate through a satellite
while remaining submerged and traveling at operational
speeds [1]. The antenna consists of 12 elements in a linear
floating hose that is attached at the end of a long tow cable. The
flexible antenna rides the waves so the instantaneous shape and
orientation  of  the  array  are  unknown.  The  multiple  elements 
provide  margin  to  the  communication  link  budget  through  the 
increased  gain  compared  to  a  single  element  system,  but  the 
principal advantage of the multiple elements is the element
diversity, which provides resistance to element wash-over
effects [2]. When one or several elements are washed over and
incapable of receiving or transmitting, the remainder of the array
will carry the load with only a small fade rather than the deep
dropout that would be experienced by a single-element antenna.


In  order  to  provide  the  antenna  gain  and  stability  from  the 
multiple  elements,  the  downlink,  i.e.,  receive,  signals  are 
combined in an RF beamformer after being phase shifted with
digitally controlled analog phase shifters. The only amplitude
control is 1-bit, i.e., the channel can be turned on or off. This
on/off  switch  allows  an  element  to  be  turned  off  when  its 
adaptive  weight  is  very  small  and  the  use  of  the  element  at  full 
gain  would  add  more  noise  than  signal.  This  results  in  only  a 
modest  loss  relative  to  implementation  of  the  actual  optimum 
amplitude  weight.  The  beamforming  is  done  in  the  analog 
domain  so  that  the  whole  adaptive  array  system  can  be  used  as  a 
drop-in  antenna  for  existing  communication  systems  without 
impacting  other  aspects  of  the  communications  hardware  or 
methods  of  operation.  The  weight  determination  for  receive 
beamforming  is  based  on  sampling  the  downlink  signal  on  the 
multiple  elements  to  determine  their  phases  and  is  covered  in 
section  2. 

When a signal is to be sent up to the satellite, a different
frequency  is  used,  differing  from  the  downlink  frequency  by 
about  15%.  Since  there  is  no  signal  from  the  satellite  at  the 
uplink  frequency  to  sample,  an  alternative  method  of  determining 
the  transmit  weights  must  be  used.  The  method  of  transmit  phase 
determination  is  based  on  calculating  receive  weight  phases  on 
two  different  downlink  frequencies  that  are  separated.  The  phase 
on  each  of  the  elements  is  then  extrapolated  linearly  to  the 
frequency  where  the  transmit  weights  are  to  be  used.  Since  the 
satellite  is  constantly  transmitting  on  all  downlink  frequencies 
there  is  no  problem  finding  two  sets  of  separated  adaptive 
weights  for  the  extrapolation  process.  Transmit  beamforming 
with  frequency  extrapolation  is  covered  in  section  3. 

2.  RECEIVE  BEAMFORMING 

2.1  Algorithm  and  Estimation  Accuracy 

The  main  goal  of  making  the  system  adaptive  is  to  phase  shift  the 
element  signals  so  that  their  phasors  are  aligned  tip-to-tail  to 
create  the  largest  possible  resultant  and  therefore  high  gain.  This 
is  accomplished  by  receiving  and  sampling  the  downlink  signal 
from  the  satellite  on  each  of  the  elements  and  creating  a  sample 
covariance  matrix.  The  satellite  system  that  this  antenna  system 
works  with  is  frequency  channelized  so  there  should  be  only  one 
signal,  i.e.,  the  signal  of  interest,  passed  through  the  receivers 
and into the sampled covariance matrix. There should only be a
single  large  eigenvalue  of  the  matrix  corresponding  to  the  signal 
of  interest  and  the  eigenvector  associated  with  the  largest 
eigenvalue  corresponds  to  the  spatial  mode  of  excitation  of  that 
signal  on  the  array.  The  conjugate  of  this  primary  eigenvector 
then  represents  the  spatial  matched  filter  to  best  receive  the 
downlink  signal.  At  this  point  any  element  whose  adaptive 
weight  is  down  from  the  nominal  level  by  more  than  6  dB  is 
turned  off.  The  actual  level  where  a  reduced-signal  element  does 
more  harm  than  good  in  the  beamforming  is  a  function  of  the 
number  of  elements  in  the  system,  but  converges  to  -6  dB  for 
large  numbers  of  elements  in  the  array.  This  adaptive  process  is 
repeated  at  a  rate  that  is  faster  than  any  changes  in  the  ocean- 
antenna  environment  and  is  on  the  order  of  10-100  Hz. 
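
The receive procedure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `receive_weights`, the synthetic data, and the use of the median weight magnitude as the "nominal level" are not from the paper.

```python
import numpy as np


def receive_weights(snapshots, off_threshold_db=-6.0):
    """Return (phases, enabled) for snapshots of shape (elements, samples)."""
    k = snapshots.shape[1]
    # Sample covariance matrix of the element signals.
    r = snapshots @ snapshots.conj().T / k
    # eigh sorts eigenvalues in ascending order, so the last column
    # is the primary eigenvector.
    _, vecs = np.linalg.eigh(r)
    w = vecs[:, -1].conj()          # spatial matched filter
    # Assumption: the median weight magnitude stands in for the "nominal level".
    level_db = 20 * np.log10(np.abs(w) / np.median(np.abs(w)))
    enabled = level_db > off_threshold_db   # 1-bit amplitude control
    return np.angle(w), enabled


rng = np.random.default_rng(0)
n, k = 12, 64
true_phase = rng.uniform(-np.pi, np.pi, n)
amplitude = np.ones(n)
amplitude[3] = 0.1                  # element 3 is washed over
steering = amplitude * np.exp(1j * true_phase)
signal = np.exp(1j * rng.uniform(0, 2 * np.pi, k))
noise = 0.1 * (rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k)))
x = np.outer(steering, signal) + noise

phases, enabled = receive_weights(x)
# After phase shifting, the enabled phasors should line up.
resultant = np.abs(np.sum(np.exp(1j * (phases + true_phase))[enabled]))
```

In this synthetic case the washed-over element falls well below the threshold and is switched off, while the remaining eleven phasors add almost coherently.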

The accuracy of the adaptive phase determination is not as
critical in this type of beamforming as in nulling systems,
since a small misalignment of phasors all in a line will still have a
large resultant. An analysis of Gaussian distributed phase errors
on the element channels leads to an expected loss (or relative
gain) in the beamformer of

$$G = e^{-\sigma^2}$$

where $\sigma$ is the RMS phase error level in radians. This
beamforming  loss  is  independent  of  the  number  of  elements.  The 
spread  about  the  expected  value,  however,  does  depend  on  the 
number  of  elements  in  the  system.  This  loss  equation  is  plotted 
in  Fig.  1  as  the  solid  line  along  with  simulation  results  for  a  10- 
element  system,  where  10  runs  were  made  at  each  0.1  radian 
phase  error  level.  The  spread  about  the  expected  value  is  larger 
for  large  phase  errors  and  for  small  numbers  of  elements. 
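
A small Monte Carlo run illustrates the kind of simulation described here. It assumes the expected relative gain takes the form exp(-sigma^2), which matches the roughly 1 dB loss at 0.5 radians quoted later in the paper; the function name and trial count are arbitrary choices. For a finite number of elements the exact mean carries a small additional 1/N term, which the sketch also evaluates.

```python
import numpy as np


def simulated_gain(sigma, n_elements, trials, rng):
    """Mean beamformer power relative to perfectly aligned phasors."""
    errors = rng.normal(0.0, sigma, size=(trials, n_elements))
    power = np.abs(np.exp(1j * errors).sum(axis=1)) ** 2
    return power.mean() / n_elements**2


rng = np.random.default_rng(1)
sigma, n = 0.5, 10
simulated = simulated_gain(sigma, n, trials=20000, rng=rng)
large_array = np.exp(-sigma**2)                   # assumed loss expression
finite_array = 1 / n + (1 - 1 / n) * large_array  # exact mean for n elements
loss_db = -10 * np.log10(large_array)
```

Individual runs scatter around the mean, and the scatter grows with the phase error and shrinks with the number of elements, which is the spread referred to in the text.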


Figure 1. RMS phase errors affect the relative gain of the
beamformer.  The  theoretical  loss  is  shown  along  with 
simulation  results  for  a  10-element  system  as  a  function 
of  the  level  of  the  Gaussian  distributed  errors. 

The  number  of  samples,  k,  used  in  forming  the  sample  covariance 
matrix and the signal-to-noise ratio, SNR, on the elements affect
both the phase error level and the amplitude error level. These
effects can be incorporated into the relative gain expression,
resulting in a simple approximate expression for loss as a
function of k and SNR:

$$G = e^{-1/(k \cdot \mathrm{SNR})}.$$

A  plot  of  this  equation  is  shown  in  Fig.  2  as  the  solid  line  for  a 
k = 10 sample case. A series of simulations was run, one at each
integer dB level of SNR, and Fig. 2 shows the inner product of
the  signal  vector  with  each  of  the  eigenvectors  as  circles  in  the 
plot.  It  can  be  seen  that  the  primary  eigenvector  is  a  very  good 
estimate  of  the  signal  vector,  especially  at  higher  SNR  levels. 


Figure  2.  Lines  show  theory  and  circles  show 
simulations  of  inner  product  of  signal  and  eigenvectors. 
The  simulation  was  run  at  each  integer  dB  level. 


The  other  noise  eigenvectors  will  be  orthogonal  to  the  primary 
eigenvector  and  approximately  orthogonal  to  the  signal  vector. 
One  measure  of  the  quality  of  the  signal  estimate  that  is  more 
readily  seen  in  the  plot  of  Fig.  2  is  the  level  of  the  inner  product 
of  the  signal  with  the  noise  eigenvectors.  High  SNR  leads  to  a 
high  degree  of  orthogonality.  A  theoretical  expression  for  these 
noise inner product levels, NILs, can be given by

where N is the number of elements in the system. The equation
points  out  that  NIL  is  not  nil,  although  it  is  close.  These 
approximate  expressions  show  that  SNR  and  k  can  be  traded  off 
against one another as needed in the system design; however, the
accuracies of the expressions are worse at very low SNR values,
and care should be used in the tradeoff.


2.2  Experimental  Results 

A seven-element experimental array was built to prove the
beamforming concept. A strong CW signal transmitted from a
helicopter passing the 170-foot towed array was used as a
substitute for the satellite downlink signal. The power received
from each of the seven elements is plotted in Fig. 3, and the
dropouts of individual channels from ocean washover are evident.
The overall change in power levels over the 40-second run is
caused by the changing range of the source on the helicopter as it
flew past from stern to bow. The output of the RF beamformer
was also received and sampled and is shown as the top curve of
Fig.  3.  It  is  clear  that  the  seven  elements  provided  both  gain  and 
signal  stability.  Dropouts  on  individual  channels  show  up  in  the 
beamformed  signal  as  small  dips  on  a  signal  that  has  much  higher 
SNR  due  to  the  array  gain. 


Figure  3.  The  element  power  vs.  time  is  plotted  for  a 
helicopter  fly-by  experiment.  The  beamformer  output  is 
the  top  curve. 

The  individual  elements  are  spaced  far  enough  apart  so  that  the 
waves  impact  them  in  an  independent  fashion.  If  the  elements 
are  too  close,  a  washout  on  one  element  would  be  highly 
correlated  with  washouts  on  the  neighboring  elements  and  many 
more elements would be needed in the array to reach the
performance achieved with the widely separated elements. Since
there is a desire to make the array as short as possible with few
elements for simplicity, an interesting tradeoff arises for the array
design  of  the  spacing  and  number  of  elements  required  to  meet 
the  performance  goals  [3]  [4]. 

2.3 Weight Smoothing

In  addition  to  providing  gain  and  signal  level  stability,  it  is  also 
necessary  to  be  sure  that  the  phase  at  the  output  of  the 
beamformer  is  stable  so  that  the  signal  modulation  is  not 
corrupted.  While  this  is  not  a  beamforming  task  in  itself,  it  is 
necessary  for  any  modulation  system  that  depends  on  the  signal 
phase.  The  beamformer  output  phase  is  arbitrarily  set  at  every 
weight  update  cycle  by  the  adaptive  weights  which  are  only 
determined  within  a  random  phase  factor.  Both  the  old  weights 
and  the  new  will  provide  good  beamforming  at  the  transition 
time,  but  there  will  be  a  phase  jump  at  the  beamformer  output 
unless  corrective  action  is  taken.  In  this  adaptive  system  a 
correction  phase  is  determined  for  the  new  weights  by  applying 
both  sets  of  weights  digitally  to  the  new  block  of  data  and 
calculating  the  average  phase  shift  between  the  output  of  the  old 
weights relative to the new weights. This measured phase offset
is then used as an adjustment on the new weights before they are
applied to the RF beamformer.
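
The phase correction step can be sketched as follows. This is a minimal illustration, not the fielded implementation: the name `phase_align` is hypothetical, and the "average phase shift" is taken here as the angle of the summed conjugate product of the two outputs, which is one reasonable (power-weighted) way to average a phase.

```python
import numpy as np


def phase_align(w_old, w_new, block):
    """Rotate w_new by one common phase so that its output on `block`
    (shape: elements x samples) lines up with the output of w_old."""
    y_old = w_old @ block
    y_new = w_new @ block
    # Power-weighted average phase of the old output relative to the new.
    offset = np.angle(np.sum(y_old * y_new.conj()))
    return w_new * np.exp(1j * offset)


rng = np.random.default_rng(3)
n, k = 7, 32
a = np.exp(1j * rng.uniform(-np.pi, np.pi, n))        # element phases
block = np.outer(a, np.exp(1j * rng.uniform(0, 2 * np.pi, k)))
block += 0.05 * (rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k)))

w_old = a.conj()
w_new = a.conj() * np.exp(1j * 1.3)   # same beam, arbitrary phase factor
w_fixed = phase_align(w_old, w_new, block)
residual = np.angle(np.sum((w_old @ block) * (w_fixed @ block).conj()))
```

Because only a single common phase is applied to all elements, the correction leaves the beam pattern unchanged and removes only the output phase jump.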

2.4  Interference 

While  it  is  intended  that  the  system  operate  with  only  one  signal 
present  during  the  estimation  of  the  covariance  matrix,  it  is 
possible  that  a  jammer  from  a  direction  other  than  the  direction 
of  the  satellite  would  try  to  confuse  the  adaptive  process.  If  a 
large  jammer  signal  is  present  along  with  the  smaller  desired 
signal,  the  covariance  matrix  will  have  two  large  eigenvalues 
rather  than  one.  The  two  eigenvectors  associated  with  the  two 
large  eigenvalues  span  the  signal  space  of  the  two  signal  vectors. 
To  a  large  extent  a  one-to-one  association  can  be  made  between 
each  eigenvector  and  one  of  the  signals  if  the  power  of  the  two 
signals  is  different,  although  strictly  speaking  each  eigenvector 
will  have  a  portion  of  each  signal  if  the  signals  are  not  spatially 
orthogonal.  With  a  larger  jammer,  the  signal  vector  can  be 
estimated  by  the  second  eigenvalue  and  eigenvector.  If  the 
conjugate of this second eigenvector is used as the array weight
vector  then  the  desired  signal  will  be  well  received  and  the 
jammer  will  be  well  rejected  since  the  eigenvectors  are 
orthogonal  to  each  other. 
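
The use of the second eigenvector can be illustrated with synthetic data. The sketch below assumes a 10-element line array, a jammer 16 dB above the desired signal, and phase-ramp signal vectors at two arbitrary, hypothetical spatial frequencies; none of these numbers reproduce the sea trial.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 10, 2000


def steering(n_elements, u):
    # Phase ramp across a line array; u is a hypothetical normalized
    # spatial frequency used only to obtain two distinct signal vectors.
    return np.exp(2j * np.pi * u * np.arange(n_elements))


s_vec, j_vec = steering(n, 0.05), steering(n, 0.31)
sig = np.exp(1j * rng.uniform(0, 2 * np.pi, k))                    # desired
jam = 10 ** (16 / 20) * np.exp(1j * rng.uniform(0, 2 * np.pi, k))  # +16 dB
noise = 0.05 * (rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k)))
x = np.outer(s_vec, sig) + np.outer(j_vec, jam) + noise

r = x @ x.conj().T / k                 # sample covariance matrix
_, vecs = np.linalg.eigh(r)            # eigenvalues in ascending order
w = vecs[:, -2].conj()                 # conjugate of the second eigenvector

sig_gain_db = 20 * np.log10(abs(w @ s_vec))
jam_gain_db = 20 * np.log10(abs(w @ j_vec))
null_depth_db = sig_gain_db - jam_gain_db
```

The difference between the two responses is the null depth discussed in the text; it depends on the power separation, the number of elements, and the number of samples used for the covariance estimate.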

The  degree  of  separation  of  the  signals  into  separate  eigenvectors 
can  be  seen  by  looking  at  the  inner  product  of  the  second 
eigenvector  with  each  of  the  two  signal  vectors.  The  difference 
between  the  two  represents  the  null  depth  that  is  achievable.  The 
degree  of  separation,  i.e.,  the  null  depth,  for  spatially  separated 
signals  can  be  represented  by: 


where P_n is the signal power and N is the number of elements in
the  system.  This  is  illustrated  in  Fig.  4  for  a  10-element  system 
where  simulation  results  are  plotted  at  each  integer  dB  level  of 
signal  power  separations.  For  most  levels  of  signal  power 
separations  nulling  and  signal  reception  can  be  achieved. 

This has been confirmed in sea trials with digital beamforming on
the seven-element system, where a jammer signal was placed on
the tow boat while the weaker desired signal was transmitted
from a helicopter. The jammer-to-signal level was 16 dB on the
elements, so a 40 dB null depth was predicted. By using the second
eigenvector, the array signal-to-interference ratio was improved
by 45 dB relative to the single-element signal-to-interference
ratio.

Jamming  and  interference  are  not  expected  in  the  current 
application,  but  if  they  are  expected,  these  results  show  that 
enough  signal  separation  can  be  achieved  to  improve  the 
beamformer output. If nulling is desired, it is expected that
digital beamforming and amplitude control would be required for
better accuracy, unlike in the current application.

Figure 4. Simulation results show the degree of
alignment  between  the  second  eigenvector  and  the 
desired  signal  and  the  jammer  interference. 


3.  TRANSMIT  BEAMFORMING  WITH 
FREQUENCY  EXTRAPOLATION 

3.1  Algorithm 

Beamforming  on  transmit  for  the  uplink  to  the  satellite  is  a  much 
more  difficult  task  than  receive  beamforming  on  the  downlink 
because  the  uplink  frequency  is  separated  from  the  downlink 
frequency  by  about  15%.  There  is  no  data  at  the  uplink 
frequency  to  use  for  the  weight  estimation.  There  are,  however, 
many  downlink  frequencies  that  are  constantly  in  use,  so  receive 
weights  can  be  estimated  at  multiple  separated  receive 
frequencies.  The  approach  taken  in  this  design  is  to  estimate 
receive  weights  at  two  separated  frequencies  and  linearly 
extrapolate  the  phase  of  each  element  to  the  transmit  frequency, 
using  the  first  two  terms  of  a  Taylor  series  expansion  of  the 
element  phase. 

3.2  Weight  Bandwidth 

Before  investigating  the  extrapolation,  it  is  useful  to  look  at  the 
effective  bandwidth  of  array  weights.  If  array  weights  are  used  at 
a  frequency  that  is  different  from  the  frequency  where  they  were 
calculated,  they  will  still  work  with  a  small  degradation  as  long 
as the frequency change is not too big. The loss due to the use of
array weights with a frequency change of $\Delta f$ can be approximated
by

$$\mathrm{Loss} = \frac{\sin^2\left(\frac{\pi}{c}\,\Delta f\,L\sin\theta\right)}{\left(\frac{\pi}{c}\,\Delta f\,L\sin\theta\right)^2}$$

where c is the speed of propagation, L is the length of the array,
and $\theta$ is the angle of arrival measured from broadside to the
array. This leads to a weight bandwidth, BW, of approximately

$$BW = \frac{c}{L\sin\theta}$$

The  15%  change  in  frequency  in  this  application  is  well  beyond 
one  half  of  the  effective  weight  bandwidth  calculated  for 
moderate  arrival  directions  of  60  degrees  off  broadside  and  an 
array  length  of  100  feet.  This  means  that  new  transmit  weights 
will  have  to  be  estimated. 
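
The arithmetic behind this statement can be checked with a short calculation. The sketch assumes that the loss has the usual sinc-squared form with bandwidth c/(L sin(theta)), that propagation is at the free-space speed of light, and that the downlink carrier is 250 MHz; the carrier value is purely hypothetical and is not given in the text.

```python
import math

C = 299_792_458.0            # free-space speed of light, m/s (assumed medium)
L = 100 * 0.3048             # array length: 100 ft expressed in metres
THETA = math.radians(60)     # arrival angle measured from broadside
F_DOWNLINK = 250e6           # hypothetical downlink carrier in Hz


def weight_loss(delta_f):
    """Assumed sinc^2 loss when weights are used delta_f away from
    the frequency where they were calculated."""
    x = math.pi * delta_f * L * math.sin(THETA) / C
    return 1.0 if x == 0 else (math.sin(x) / x) ** 2


bw = C / (L * math.sin(THETA))        # about 11.4 MHz
shift = 0.15 * F_DOWNLINK             # 15% frequency change
half_bw_loss_db = -10 * math.log10(weight_loss(bw / 2))
```

Under these assumptions the 15% shift is several times larger than half the weight bandwidth, where the loss is already close to 4 dB.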

3.3 Frequency Extrapolation

Two  potential  problems  arise  with  frequency  extrapolation  of  the 
element weights. The actual phase on receive is only known
modulo  2pi  since  that  is  what  comes  out  of  the  receivers.  The 
extrapolation  will  yield  erroneous  results  if  unknown  additions  of 
2pi  are  left  out.  The  other  complication  arises  from  the 
amplification  of  measurement  errors  that  can  occur  with  large 
levels  of  extrapolation.  The  2pi  ambiguity  problem  will  be 
considered  first. 

The truncated Taylor series of the eigenvector phases can be
expressed as

$$\phi_3 = \phi_1 + \left(\frac{\Delta f_{31}}{\Delta f_{21}}\right)\Delta\phi_{21}$$

where the $\phi$s are the phases, the subscripts refer to the frequencies (1
and 2 are the lower and upper receive frequencies and 3 is the
higher transmit frequency), and the $\Delta$s refer to frequency or phase
differences (e.g., $\Delta f_{31} = f_3 - f_1$). The factor in parentheses on the right will be
referred to as M or the extrapolation ratio, i.e.,

$$M = \frac{\Delta f_{31}}{\Delta f_{21}}$$

The actual receive phases or $\Delta$ phases on the channels can be
expressed as

$$\phi_{actual} = 2\pi n + \phi_{measured}, \qquad n = 0, \pm 1, \pm 2, \ldots$$

If the extrapolation ratio, M, is restricted to be an integer, then the
value of $M\phi_{actual} \pmod{2\pi}$ is the same as $M\phi_{measured} \pmod{2\pi}$, so it
does not matter that we do not know the value of n. The 2pi
ambiguity problem is eliminated by using integer extrapolation
ratios.

In  the  current  application  there  are  many  downlink  receive 
frequency  channels  that  can  be  used  as  an  auxiliary  receive 
weight  estimation  frequency  for  the  extrapolation.  An  auxiliary 
channel  can  be  found  such  that  the  extrapolation  ratio  is  close  to 
an  integer  ratio.  The  previous  analysis  of  the  effective  bandwidth 
of  the  weights  shows  that  integer  extrapolation  to  a  frequency 
that  is  close  to  the  uplink  frequency  is  good  enough. 
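
A short numerical check shows why an integer ratio removes the ambiguity. The sketch assumes the extrapolation has the form phi3 = phi1 + M (phi2 - phi1) and uses hypothetical numbers for the frequencies and the phase slope; only wrapped (measured) phases are passed to the extrapolation.

```python
import math

TWO_PI = 2 * math.pi


def wrap(phi):
    """Wrap a phase to the interval [-pi, pi)."""
    return (phi + math.pi) % TWO_PI - math.pi


def extrapolate(phi1_meas, phi2_meas, m):
    """phi3 = phi1 + m * (phi2 - phi1), using wrapped measurements only."""
    return wrap(phi1_meas + m * wrap(phi2_meas - phi1_meas))


# Hypothetical element whose true phase grows linearly with frequency.
slope, f1, f2 = 0.9, 100.0, 104.0
phi1, phi2 = slope * f1, slope * f2    # 90.0 and 93.6 rad, many wraps of 2*pi


def extrapolation_error(m):
    f3 = f1 + m * (f2 - f1)
    estimate = extrapolate(wrap(phi1), wrap(phi2), m)
    return abs(wrap(estimate - slope * f3))


err_integer = extrapolation_error(3)      # integer ratio
err_fraction = extrapolation_error(2.5)   # non-integer ratio
```

With M = 3 the unknown multiples of 2pi cancel and the error is at the level of floating-point rounding; with M = 2.5 the dropped 2pi in the phase difference becomes a half-cycle error.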

The problem of measurement noise amplification in the
extrapolation process can be thought of as arising from the
derivative in the Taylor series, since it is well known that
differentiating noise will amplify errors. If an assumption is
made that the RMS measurement phase noise is the same at each
of the two downlink receive frequencies, an expression for the
phase error amplification factor can be given by

$$A = \sqrt{M^2 + (M-1)^2}$$

This says that if the RMS phase error at the two downlink receive
frequencies is $\sigma$, then the RMS phase error after using an
extrapolation ratio of M is $A\sigma$ at the uplink transmit frequency.
The amplification factor, A, is always greater than 1.0 and is
approximately linear for values of M ranging from one to five. A plot of
the  amplification  factor  is  shown  in  Fig.  5.  It  can  be  seen  that  if 
a  certain  phase  error  level  is  required  on  transmit  in  order  to 
minimize  the  beamforming  loss  then  the  phase  error  requirement 
on  the  receive  weights  is  five  times  tighter  for  an  extrapolation 
ratio  of  4.0. 


Figure 5. The phase error amplification factor, A, is
approximately  a  linear  function  of  extrapolation  ratio,  M, 
for  values  of  M  ranging  from  one  to  five. 


It  can  be  calculated  and  it  is  shown  in  Fig.  1  that  if  the  transmit 
beamformer  loss  is  to  be  limited  to  about  1.0  dB,  then  the  RMS 
transmit  phase  error  must  be  limited  to  about  0.5  radians  (28 
deg.).  The  plot  in  Fig.  6  shows  the  limit  of  RMS  receive  phase 
errors  that  must  be  obtained  in  order  to  achieve  the  required  0.5 
radians  at  the  transmit  frequency  after  extrapolating.  It  can  be 
seen  that  transmit  beamforming  becomes  quite  difficult  for  large 
extrapolation  ratios. 
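
The receive phase error budget can be tabulated directly. The sketch assumes the amplification factor has the form A = sqrt(M^2 + (M - 1)^2), which follows from phi3 = (1 - M) phi1 + M phi2 with independent, equal-variance errors on the two receive phases and agrees with the factor of five quoted for M = 4. The function names are illustrative.

```python
import math


def amplification(m):
    """Assumed RMS phase-error amplification for extrapolation ratio m."""
    return math.sqrt(m**2 + (m - 1) ** 2)


def max_receive_error(m, transmit_limit=0.5):
    """Largest RMS receive phase error (radians) that keeps the
    extrapolated transmit phase error within transmit_limit."""
    return transmit_limit / amplification(m)


budget = {m: (amplification(m), max_receive_error(m)) for m in range(1, 6)}
for m, (a, limit) in budget.items():
    print(f"M = {m}: A = {a:.2f}, receive limit = {limit:.3f} rad")
```

For an extrapolation ratio of 4 the receive phases must be held to about 0.1 radians RMS to stay within the 0.5 radian transmit limit.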


Figure  6.  The  RMS  receive  phase  error  that  is  required 
in  order  to  achieve  1  dB  of  transmit  beamforming  loss  is 
plotted  as  a  function  of  the  extrapolation  ratio,  M. 


3.4  Experimental  Results 

A  field  experiment  has  been  conducted  to  prove  the  concept  of 
transmit  extrapolation  beamforming  using  the  receive-only  test 
array  described  earlier  in  Sec.  2.  Modifications  were  made  to  the 
system  to  accommodate  the  transmit  demonstration  through  a 
receive-only  test  system  by  using  reciprocity.  Four  elements 
were  used.  Signals  were  transmitted  from  the  helicopter  at  three 
frequencies,  representing  the  two  downlink  frequencies  and  the 
one  uplink  frequency.  Adaptive  weights  were  calculated  using 
the  receive  algorithm  at  all  three  frequencies.  The  two  sets  of 
weights  at  the  downlink  frequencies  were  then  extrapolated  to 
estimate  weights  for  the  transmit  frequency.  These  extrapolated 
“transmit”  weights  were  then  compared  to  the  receive  adaptive 
weights  that  were  calculated  directly  at  the  transmit  frequency  by 
performing  digital  beamforming  with  the  two  sets  of  “transmit” 
weights.  The  results  are  plotted  in  Fig.  7  along  with  the  powers 
of  each  of  the  four  individual  channels  for  a  case  with  an 
extrapolation  ratio  of  2.0.  It  is  clear  that  both  sets  of  transmit 
weights  perform  well  with  only  a  small  loss  using  the 
extrapolated weights relative to the directly calculated
(non-extrapolated) weights.

Figure 7. “Transmit” extrapolation beamforming results
for an experiment using receive data and reciprocity for
an extrapolation ratio, M, of 2.0.


4. SUMMARY

An  adaptive  beamforming  system  has  been  designed  and  built 
that  will  enable  submarines  to  communicate  through  satellites 
while  remaining  at  operational  speeds  and  depths.  The  downlink 
receive  beamforming  is  based  on  weights  derived  from  the 
primary  eigenvector  of  the  sample  covariance  matrix.  Only 
moderate  phase  accuracy  is  required  to  provide  both  gain  and 
element  diversity  to  overcome  the  dropouts  in  the  ocean 
environment.  Interference  signals  can  be  spatially  separated  from 
the  desired  signal  from  the  satellite  with  this  type  of  algorithm, 
although  no  interference  is  anticipated  in  this  application. 

The  transmit  uplink  beamformer  cannot  use  the  same  weights  as 
are  used  on  the  receive  downlink  because  the  uplink  and  the 
downlink  frequencies  are  separated  by  about  15%.  A  frequency 
extrapolation  method  is  used  where  the  phases  and  derivatives  of 
phases  with  respect  to  frequency  are  estimated  and  used  to 
project  the  element  phases  to  the  uplink  frequency  by  the  use  of  a 
truncated  Taylor  series.  In  order  to  avoid  2pi  ambiguity 
problems  in  the  extrapolation,  it  is  necessary  that  the  frequency 
gap  to  the  uplink  frequency  be  an  integer  multiple  of  the 
frequency  gap  between  the  two  downlink  frequencies.  An  exact 
integer  ratio  is  not  required  since  the  weights  have  a  reasonably 
sized  bandwidth,  based  on  the  array  length  and  the  signal  angle 
of  arrival.  Much  greater  accuracy  is  required  in  the  receive 
weight  estimation  in  order  to  have  reasonable  accuracy  at  the 
transmit  frequency  after  extrapolation  since  phase  errors  are 
amplified  in  the  extrapolation  process. 

Both  the  receive  and  transmit  algorithms  and  beamforming  have 
been  tested  at  sea  with  good  results. 

ABSTRACT

A  common  approach  to  suppressing  jamming  or  RFI  is 
(adaptive)  beamforming,  where  an  antenna  pattern  null 
is  formed  by  appropriately  combining  multiple  receive 
channels. A sidelobe canceller is a common such
implementation.

Beamforming is undesirable when the interference
source is in the mainlobe of the radar, because the
antenna pattern null created by the beamformer
produces a region where ground imaging cannot be
performed.

This  paper  presents  two  conceptual  alternatives  to 
spatial beamforming. The first approach produces a
SAR image by combining the pulse returns from multiple
channels in a non-separable way. This “space time
beamforming” is shown to produce a null which is
significantly narrower and shallower than that produced
by conventional spatial beamforming. Further, we
demonstrate that the space time beamforming null
becomes narrower as the length of the synthetic aperture
(i.e. the Doppler resolution) increases.

A  second  alternative  to  spatial  beamforming  is  presented 
which  is  useful  when  the  interference  source  is  non-white 
or when it is desirable to estimate the (spatially
localized) interfering signal. This signal separation
approach allows generic localized sources such as
moving target signatures, vibrating target paired echoes,
etc.  to  be  separated  from  the  clutter  data. 

1.  INTRODUCTION 

A  typical  approach  to  radio  frequency  interference  (RFI) 
and  jamming  suppression  for  multi-channel  radars  is 
(spatial)  beamforming  [1].  Here,  a  linear  combination  of 
receive  channels  is  used  to  produce  an  antenna  pattern 
null  on  receive  in  the  direction(s)  of  the  interference.  A 
sidelobe  canceller  is  a  common  such  implementation. 


Spatial beamforming works well when the RFI source is in
the sidelobes of the radar; however, in the mainbeam,
spatial beamforming produces a deep, wide notch. For
imaging  radars,  this  notch  produces  a  region  where  clutter 
reflectivity  cannot  be  estimated. 

In  this  paper  we  present  two  alternative  approaches  to 
spatial  beamforming.  The  first  uses  non-separable  space 
time  beamforming  to  produce  a  much  narrower,  shallower 
null.  The  second  approach  provides  a  method  for 
separating  the  clutter  and  localized  interference  signals 
when  both  of  these  are  of  interest.  We  compare  the 
performance  of  spatial  beamforming  vs.  space  time 
beamforming  in  terms  of  the  width  and  depth  of  the  clutter 
notch  produced. 

Consider  SAR  image  formation  as  a  problem  of  estimating 
the  radar  cross  section  of  each  range/Doppler  cell  in  the 
presence  of  thermal  noise  and  localized  RFI.  We  consider 
the  width  of  the  region  of  range/Doppler  cells  whose 
Cramer-Rao  variance  bounds  exceed  a  given  threshold. 
We  show  that  the  non-separable  spatial  (multiple 
channels) and temporal (multiple pulses) processing
produces  a  much  narrower  null  width  (as  defined  above) 
than  conventional  separable  beamforming. 

The example results shown in this paper utilize video phase
history  data  collected  by  Veridian  System’s  DCS  radar 
with  synthetic  RFI  introduced  prior  to  image  formation. 

The  author  would  like  to  thank  Mike  Beauvais  for  his  help 
with  producing  the  examples  shown  in  the  paper  and  Mark 
Stuff  for  several  interesting  discussions. 

2.  TECHNICAL  DISCUSSION 

2.1 Spatial Beamforming

A  typical  approach  to  RFI  suppression  is  adaptive 
beamforming.  Here,  a  particular  coherent  combination  of 
the  receive  channels  from  a  multi-channel  antenna  is 
chosen so as to maximize the signal-to-interference-plus-noise
ratio (SINR) in a particular steering direction. This is
illustrated  in  Figure  1  and  Figure  2. 

Figure 1: Multi-channel antenna


Figure  2:  Beamforming  applied  to  SAR  imaging. 

The Gauss-Markov theorem provides a closed form for the
weight vector w which maximizes this SINR when the
covariance of the interference and noise is known.

We model our interference covariance as the sum of a
spatially white thermal noise term with variance $\sigma^2$ and a
(rank 1) spatially localized RFI term which is the outer
product of the steering vector to the RFI source with itself.


where t is a tapering vector and $\circ$ denotes the Hadamard
(pointwise) product. The weight vector is optimal when
t = (1,1,...,1); however, for purposes of sidelobe reduction,
a weighted taper is generally used.
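
The covariance model and tapered weights described above can be sketched numerically. The example assumes the standard maximum-SINR form, in which the untapered weight vector is the inverse covariance applied to the look-direction steering vector; the phase-ramp `steering` function, the channel count, and the power levels are hypothetical choices for illustration.

```python
import numpy as np


def steering(n, u):
    """Phase ramp for an n-channel line array at normalized spatial
    frequency u (a hypothetical parameterization, cycles per channel)."""
    return np.exp(2j * np.pi * u * np.arange(n))


def adapted_weights(n, u_steer, u_rfi, noise_var=1.0, rfi_power=1e4, taper=None):
    v = steering(n, u_rfi)
    # White thermal noise plus a rank-1 RFI term (outer product of the
    # RFI steering vector with itself).
    r = noise_var * np.eye(n) + rfi_power * np.outer(v, v.conj())
    w = np.linalg.solve(r, steering(n, u_steer))   # R^{-1} s
    t = np.ones(n) if taper is None else taper
    return t * w                                   # Hadamard product with taper


n = 8
w = adapted_weights(n, u_steer=0.0, u_rfi=0.2)
look_response_db = 20 * np.log10(abs(w.conj() @ steering(n, 0.0)))
rfi_response_db = 20 * np.log10(abs(w.conj() @ steering(n, 0.2)))
```

With the RFI source well away from the look direction, the adapted pattern keeps its response in the steering direction while placing a deep null on the interference.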

A  typical  implementation  of  beamforming  is  a  sidelobe 
canceller.  Here  the  main  subarrays  of  the  antenna  are  used 
for  beamsteering  and  a  small  number  of  auxiliary  channels 
are then adaptively combined with the main channel for RFI
cancellation  in  the  sidelobes. 

Beamforming  can  be  applied  to  SAR  image  formation  by 
first  forming  a  coherent  (spatial)  combination  of  the  receive 
channels  and  then  passing  this  into  a  SAR  image  formation 
processor  which  then  forms  the  temporal  combination  of 
received  pulses  appropriate  to  scene  reconstruction.  This 
approach  is  shown  in  Figure  2. 

This  separable  spatial-then-temporal  processing  works  well 
when  the  RFI  source  occurs  in  the  sidelobes,  but  has 
undesirable  effects  as  the  RFI  source  enters  the  mainlobe. 
The -40 dB Taylor tapered adapted antenna patterns for
various  RFI  source  locations  are  shown  in  Figure  3. 




Figure  3:  Beamformer  antenna  patterns. 


As  can  be  seen,  an  RFI  source  in  the  sidelobes  has  very 
little  effect  on  the  sidelobe  levels  or  on  the  mainlobe  shape. 
However  as  soon  as  the  source  enters  the  mainlobe,  the 
sidelobe  levels  rise  and  the  mainlobe  distorts.  The  worst 
degradation  occurs  when  the  RFI  source  coincides  with  the 
beamsteering  direction.  In  this  case  a  wide,  deep  notch 
appears  in  the  mainbeam  and  the  sidelobes  are  elevated  by 
20dB. 

2.2  Space  Time  Beamforming 

The problem inherent in separable spatial-then-temporal
beamforming for SAR imaging is that the optimal weights
maximize the SINR only in the steering direction.
Simultaneous maximization of SINR in all directions
inherently requires a non-separable approach. To develop
such an approach, we consider the very simple DPCA data
model shown in Figure 4.
Figure 4: Space time signal model.

Here a received data sample, shown in the radar data cube,
is indexed by channel (element), pulse and wavelength and
consists of a deterministic clutter coefficient and a random
noise + RFI component. We model the clutter as stationary
and thus dependent only on the spatial location of the
receiving antenna phase center. For illustration, we
consider the simple DPCA situation where the antenna
moves one phase center spacing between pulses. In this
case, the clutter coefficient $c_{m+n}$ in $x_{m,n}$ depends only on the
sum $m + n$.
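
The dependence on m + n can be made concrete with a small sketch. It assumes that the samples are stacked pulse by pulse (pulse index m outer, channel index n inner); that ordering, the function name and the numeric values are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def dpca_clutter_matrix(n_pulses, n_channels):
    """Return Z of shape (M*N, M+N-1) such that x = Z @ c stacks x[m, n] = c[m + n]."""
    M, N = n_pulses, n_channels
    Z = np.zeros((M * N, M + N - 1))
    for m in range(M):
        for n in range(N):
            # Sample (pulse m, channel n) observes clutter coefficient c[m + n].
            Z[m * N + n, m + n] = 1.0
    return Z

M, N = 4, 3
Z = dpca_clutter_matrix(M, N)
c = np.arange(1.0, M + N)      # six illustrative clutter coefficients
x = (Z @ c).reshape(M, N)      # row = pulse, column = channel
```

Every anti-diagonal of `x` holds a single clutter coefficient, which is the redundancy that the space time estimator exploits.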

Written  as  a  matrix  equation  in  the  case  of  M  =  4  pulses 
and  N  =  3  channels  we  have 
We model the interference as consisting of a spatially and
temporally white noise component and a spatially localized
and temporally white RFI source.
corresponding to the position of the RFI source as shown
in Figure 5.


Figure  5:  RFI  source  geometry. 

The  Gauss-Markov  theorem  can  be  used  to  construct  the 
best  linear  unbiased  estimator  for  the  clutter  coefficients  in 
this  colored  interference  environment.  The  clutter 
estimator  is  given  by 

$\hat{c} = (Z^H R^{-1} Z)^{-1} Z^H R^{-1} x$

The purpose of this paper is to present and compare
conceptual approaches to RFI suppression without
introducing actual algorithms; however, it's worth noting
that the BLUE for clutter coefficient estimation has a matrix
structure (Figure 6) which makes it particularly amenable to
solution using linear solvers. Evaluation of the matrix-vector
product $Z^H R^{-1} x$ amounts to evaluating the Z-transform
of x at various locations and thus can be
efficiently evaluated using the chirp-Z transform. Further,
it's straightforward to show that the matrix $Z^H R^{-1} Z$ has a
banded matrix structure with upper and lower bandwidths
$N - 1$, thus efficient sparse matrix solvers can be applied.
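
The banded structure can be checked numerically on a toy problem. The sketch below assumes the simple DPCA selection matrix Z (sample (m, n) observes c[m + n], stacked pulse by pulse) and an interference covariance that is temporally white and spatially colored, R = I_M (x) R_s with R_s = I + 100 v v^H. These modelling choices and all numeric values are assumptions made for illustration; a dense solver is used for brevity where a banded solver would be used in practice.

```python
import numpy as np

M, N = 6, 3  # pulses, channels (toy sizes)

# Selection matrix: sample (m, n) observes clutter coefficient c[m + n].
Z = np.zeros((M * N, M + N - 1))
for m in range(M):
    for n in range(N):
        Z[m * N + n, m + n] = 1.0

# Assumed spatial covariance: white noise plus rank-1 RFI, repeated per pulse.
v = np.exp(2j * np.pi * 0.5 * np.arange(N) * 0.3)
R_s = np.eye(N) + 100.0 * np.outer(v, v.conj())
R_inv = np.kron(np.eye(M), np.linalg.inv(R_s))

G = Z.conj().T @ R_inv @ Z                      # Z^H R^{-1} Z
rows, cols = np.nonzero(np.abs(G) > 1e-12)
bandwidth = int(np.max(np.abs(rows - cols)))    # largest off-diagonal offset

# BLUE of the clutter coefficients from noise-free data (sanity check).
rng = np.random.default_rng(0)
c_true = rng.standard_normal(M + N - 1) + 1j * rng.standard_normal(M + N - 1)
x = Z @ c_true
c_hat = np.linalg.solve(G, Z.conj().T @ R_inv @ x)
```

For realistic pulse counts, `G` could be passed to a banded solver such as `scipy.linalg.solve_banded` instead of `np.linalg.solve`.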


Figure 6: Space time beamforming matrix structure.
(Annotations: banded matrix, bandwidth ~ N-1; quickly
evaluated using the chirp-Z transform.)
Figure 7 shows the results of the separable spatial
beamforming and non-separable space-time beamforming
approaches applied to real SAR video phase history with
synthetic RFI.


Figure 7: Spatial vs. space time beamforming.
(Panels: Spatial Beamforming; Space Time Beamforming.)

This example corresponds to a radar with a standoff range
of 100 km, having a 3 m antenna. The simulation uses a CNR
of 26 dB and a JNR of 38 dB with N = 6 channels and
M = 2750 pulses.

The  separable  spatial  beamforming  null  is  seen  to  be 
deeper  and  wider  than  its  space-time  counterpart.  In  fact, 
the  SINR  in  the  direction  of  the  RFI  is  actually  worse  than 
had  no  beamforming  been  performed.  This  is  because  the 
separable  beamforming  is  only  optimal  in  the  steering 
direction.  The  space  time  beamformer  is  never  any  worse 
than  the  case  of  no  beamforming  and  recovers  most  of  the 
image  everywhere  but  very  near  the  RFI  source. 


2.3  Beamforming  Comparison 


The  spatial  and  space  time  beamforming  approaches  to  RFI 
suppression  can  be  compared  somewhat  more  rigorously 
by  considering  the  clutter  to  noise  ratios  produced  by 
these  methods  as  a  function  of  the  azimuth  position  of  a 
clutter  patch  and  the  azimuth  position  of  the  RFI  source. 
These  clutter  to  noise  ratios  are  given  by 
These clutter to noise ratios for the spatial and space time
beamforming are shown in Figure 8.


Figure  8:  CNR  for  spatial  (left)  vs.  space  time  (right) 
beamforming. 


The x-axis corresponds to the azimuth position of the
clutter patch and the y-axis is the azimuth position of the
RFI over a 5 km scene. The effects of the RFI position
(mainlobe vs. sidelobe) on the spatial beamforming are
evident here. No antenna pattern can be seen for the space
time beamforming because the individual subarray
patterns were not modeled.

The  width  of  the  “notch”  produced  by  beamforming  can  be 
defined  in  terms  of  a  minimally  acceptable  CNR  level. 
Figure  9  compares  the  two  approaches  as  the  number  of 
pulses  used  increases  (and  the  doppler  resolution  gets 
finer).  As  can  be  seen,  the  spatial  beamformer  produces  a 
null  whose  depth  is  relatively  independent  of  the  number 
of  pulses  used  and  whose  width  improves  only  slowly  with 
increasing  doppler  resolution.  By  contrast,  the  depth  of 
the  space  time  beamformer  notch  rises  as  the  number  of 
pulses  increases  and  the  width  improves  dramatically  with 
increasing  doppler  resolution. 

This  observation  suggests  that  the  width  of  the  null  is 
proportional  to  the  doppler  resolution  for  space  time 
beamforming,  although  the  author  has  not  proven  or 
disproved  this  as  yet. 




Figure  9:  CNR  comparison  of  spatial  (left)  vs.  space  time 
(right)  beamforming  at  various  doppler  resolutions. 


2.4  Signal  Separation 

Space time beamforming is a potentially useful technique
for suppressing RFI in the mainlobe; however, in some
situations,  the  interference  may  be  temporally  colored  or 
even  highly  structured.  Further,  for  many  applications,  the 
“RFI”  may  correspond  to  a  spatially  localized  signal  of 
interest.  Such  signals  can  include  covert  RF  tag 
communication  signals,  paired  echoes  from  rotating  or 
vibrating  objects  [3]  or  even  returns  from  moving  targets 
[2,4,5]  where  an  objective  might  be  to  image  the  moving 
targets. 

In  such  situations,  we  would  like  a  method  for  extracting 
the  clutter  signal  from  the  localized  source.  Figure  10 
illustrates  the  distribution  of  clutter  and  localized  source 
energy  in  the  radar  data  cube.  The  relation  between  the 
azimuth  location  of  a  stationary  patch  of  clutter  and  the 
doppler  frequency  it  manifests  causes  the  clutter  energy  to 
concentrate  onto  a  2D  “clutter  ridge”.  The  localized  source 
energy  also  concentrates  onto  a  plane  at  the  azimuth 
location  of  the  source.  It’s  reasonable  to  expect,  then,  that 
these  signals  can  be  separated  except  where  they  intersect 
in  the  data  cube. 



Figure  10:  Signal  separation  cartoon. 


We  introduce  another  data  model  in  which  both  the  clutter 
and  the  localized  source  are  deterministic  quantities.  The 
interference  in  this  case  is  simply  white  thermal  noise.  As 
before, we model the clutter signal as depending only on
the spatial location of the receiving phase center (Figure
11).

Figure 11: Signal separation data model.

The localized source, on the other hand, is modeled as the
product of a temporal term $s_m$ depending only on the
pulse number, and a spatial term $z^n$ depending on the phase
center n and azimuth position z of the source. In matrix
notation, we have
Since our interference is spatially and temporally white, the
best estimator of the clutter and signal coefficients is the
least squares solution


$\begin{bmatrix} \hat{c} \\ \hat{s} \end{bmatrix} = (Z^H Z)^{-1} Z^H x$


As  might  be  expected,  the  matrix  Z  is  not  full  rank.  This 
rank  deficiency  corresponds  to  the  intersection  region 
(Figure  10)  between  the  clutter  ridge  and  localized  source. 
This  problem  can  be  corrected  by  introducing  an  extra  row 
in  Z  which  effectively  allows  us  to  specify  whether  the 
inseparable  energy  in  the  intersection  should  be  included 
with  the  clutter  signal  or  the  localized  source  signal. 
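
The rank deficiency and its repair can be illustrated on a toy problem. The sketch assumes the model x[m, n] = c[m + n] + s[m] z^n with a unit-modulus z, and it uses one possible choice of extra row: a constraint that forces the source estimate to be orthogonal to z^m, so the inseparable energy is assigned to the clutter. That particular row is a guess at the kind of constraint meant here, not the paper's stated construction, and all names and values are hypothetical.

```python
import numpy as np

M, N = 4, 3
z = np.exp(2j * np.pi * 0.1)  # assumed spatial phasor of the localized source

Zc = np.zeros((M * N, M + N - 1), dtype=complex)  # clutter part
Zs = np.zeros((M * N, M), dtype=complex)          # localized-source part
for m in range(M):
    for n in range(N):
        Zc[m * N + n, m + n] = 1.0
        Zs[m * N + n, m] = z ** n
Z = np.hstack([Zc, Zs])
rank = np.linalg.matrix_rank(Z)

# The single inseparable direction: c_p = z^p and s_m = -z^m cancel exactly.
null_vec = np.concatenate([z ** np.arange(M + N - 1), -(z ** np.arange(M))])

# Noise-free data from random clutter and source coefficients.
rng = np.random.default_rng(0)
theta_true = rng.standard_normal(Z.shape[1]) + 1j * rng.standard_normal(Z.shape[1])
x = Z @ theta_true

# Extra row (illustrative): source estimate has no component along z^m.
r = np.concatenate([np.zeros(M + N - 1), np.conj(z ** np.arange(M))])[None, :]
Z_aug = np.vstack([Z, r])
x_aug = np.concatenate([x, [0.0]])
rank_aug = np.linalg.matrix_rank(Z_aug)
theta, *_ = np.linalg.lstsq(Z_aug, x_aug, rcond=None)
```

Replacing `r` with a row acting on the clutter coefficients would instead assign the intersection energy to the localized source.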

Figure  12  shows  the  result  of  applying  this  signal 
separation  technique  to  3  channel  SAR  video  phase 
history  with  synthetic  RFI  and  then  processing  the 
separated  signals  into  SAR  images  using  a  conventional 
image  formation  processor. 


Figure  12:  Signal  separation  example. 


Here  we  used  1024  pulses  and  set  up  the  simulation  to 
have  a  CNR  of  30dB  and  a  JNR  of  40dB.  We  included  the 
overlap  region  with  the  clutter  signal. 

Our  initial  results  suggest  that  the  width  of  the  intersection 
region  wherein  the  clutter  cannot  be  discerned  from  the 
source  is  proportional  to  the  doppler  resolution  of  the 
radar,  and  thus  can  be  made  more  narrow  by  collecting 
more  pulses. 


3.  CONCLUSIONS  +  FURTHER  WORK 

The  purpose  of  this  paper  was  to  suggest  three  conceptual 
approaches  to  the  problem  of  RFI  mitigation  and  more 
generally  the  problem  of  separating  the  clutter  signal  from  a 
localized source. It was shown that non-separable space
time  beamforming  is  necessary  to  effectively  combat 
mainbeam  RFI. 

Any  practical  implementation  of  these  techniques  would 
have  to  solve  three  problems  not  addressed  by  the  paper. 
The  first  problem  is  determining  the  azimuth  position  of  the 
RFI or localized source. The second is the estimation of
the interference environment (or at the very least, the noise
$\sigma_n^2$ and RFI $\sigma_J^2$ variances). Lastly, the problem of channel
balancing must be addressed. Innovative adaptive signal
processing  approaches  will  be  required  to  solve  these 
problems. 
Further Evaluations of STAP Tests in Compound-Gaussian Radar Clutter
Abstract: The performance of a parametric space-time
adaptive processing (STAP) method is presented here.
Specifically, we consider signal detection in additive
disturbance containing compound-Gaussian clutter plus
additive Gaussian thermal white noise. Performance is
compared to the normalized adaptive matched filter and
the Kelly GLRT receiver using simulated and measured
data. We focus on the issues of detection and false alarm
probabilities, constant false alarm rate (CFAR), robustness
with respect to clutter texture power variations,
and reduced training data support.

I.  Introduction 

This paper undertakes a performance comparison
of several candidate space-time adaptive processing
(STAP) algorithms in compound-Gaussian clutter for
airborne radar applications. The STAP problem is
equivalent to hypothesis testing on a complex (baseband)
measurement (test data) vector $x \in \mathbb{C}^{JN}$ with J
channels and N time pulses. Typically, x contains an
unwanted additive disturbance d with unknown covariance
matrix $R_d$ and may contain a desired signal $a e$
with unknown complex amplitude $a$ and known signal
steering vector e. The binary detection problem is to
select between hypotheses $H_0: a = 0$ and $H_1: a \neq 0$
given a single realization of x.

Current  research  [1-10]  is  addressing  the  detection 
problem  wherein  d  contains  partially  correlated  clutter 
described  by  a  product  model  [11].  Here,  the  clutter 
is  modeled  as  a  Gaussian  process  with  random  power 
variations  (scale  changes)  over  range.  This  model  is 
the basis of the spherically invariant random process
(SIRP) (or compound-Gaussian) clutter model, which
includes the Weibull and K-distributions as special
cases.

In [6, 12, 13] the optimal processor for detecting a
rank one signal in SIRP clutter was shown to be equivalent
to a matched filter compared to a data dependent
threshold. With a simple normalization, this test can
be cast in the form of the normalized matched filter
(NMF) test compared to a data dependent threshold,
the calculation of which requires knowledge of the
underlying clutter probability density function (PDF).
Determination of the clutter PDF imposes onerous training
data requirements. Consequently, ad hoc methods
have been popular in recent studies [7-10]. A popular
ad hoc method is the NMF test compared to a data
independent threshold, which was independently derived
in [7, 8].

The work of [14] (and references therein) considered
an invariance framework for hypothesis testing in Gaussian
noise having a covariance matrix with known structure
but unknown level. Interestingly, the test statistic
reported in [14] is identical to the NMF of [7, 8]. The
work of [14] also extended the NMF test to allow for an
unknown noise covariance matrix and unknown scaling,
$\eta^2$, denoting the ratio of the test and training data
variances. We refer to this test as the normalized adaptive
matched filter (NAMF). The invariance principle of
[14] (and references therein), and perforce constant false
alarm rate (CFAR), applies only when all the training
data vectors are scaled identically [10].

In SIRP clutter, however, each training data vector
realization is scaled by a different random parameter.
Although the NAMF has no known optimality properties
for SIRP clutter, it has the important feature
of minimizing dependence upon texture power. Some
performance results of the NAMF in SIRP clutter are
presented in [7, 8].

Multichannel model-based (i.e., parametric) methods
for target detection and estimation in clutter are
described in [2-4, 9, 10, 15-17]. In particular, a model-based
STAP method called the non-Gaussian parametric
adaptive matched filter (NG-PAMF), requiring
knowledge of the underlying clutter statistics, was first
proposed in [3].

In this paper, the performance of the normalized
parametric adaptive matched filter (N-PAMF) [10, 18] is
evaluated and compared with that of several candidate
STAP algorithms. Its form is the model-based approximation
of the NAMF. Statistical equivalence of the
N-PAMF test to the NG-PAMF test was shown in [10].
However, unlike the NG-PAMF, the N-PAMF test requires
no a priori knowledge of the disturbance statistics
[10]. This feature is important in real-time applications
where such information is lacking. Robustness of
Pd over a broad range of K-distribution shape parameters
($\alpha$) ranging from Gaussian ($\alpha = \infty$) to high-tailed
PDF ($\alpha = 0.1$) is presented here. These considerations
enable assessments of CFAR performance with respect
to the amplitude probability density function (APDF)
associated with clutter texture variations. Finally, we
examine performance versus data support size used for
disturbance estimation. This issue is of considerable
importance in applications where training data support
is limited.

II.  The  Clutter  Model 

Clutter observed in a single channel admits a
representation of the form

$c_k(n) = v_k(n) g_k(n)$   (1)

where a complex-Gaussian process $g_k(n)$ (speckle
component) corresponding to time n and range cell k is
modulated by a statistically independent non-negative
process $v_k(n)$ (texture component). Frequently, $v_k(n)$
is approximated as a random variable V over k, but
constant over time if it has long temporal coherence.
Thus, (1) reduces to the representation theorem [11]
for an SIRP and is given by

$c_k(n) = v_k g_k(n)$.   (2)
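
A minimal simulation of the product model in (2) is sketched below. It assumes that the texture power $v_k^2$ is gamma distributed with unit mean (a common way to obtain K-distributed amplitudes) and, for brevity, that the speckle is white over time; neither the parameterization nor the function name comes from the paper.

```python
import numpy as np

def sirp_clutter(n_cells, n_pulses, shape, rng):
    """Draw c_k(n) = v_k * g_k(n) with one texture value v_k per range cell."""
    # Assumption: v_k^2 ~ Gamma(shape, scale=1/shape), so that E[v^2] = 1.
    v = np.sqrt(rng.gamma(shape, 1.0 / shape, size=(n_cells, 1)))
    # Unit-power complex Gaussian speckle (temporally white in this sketch).
    g = (rng.standard_normal((n_cells, n_pulses))
         + 1j * rng.standard_normal((n_cells, n_pulses))) / np.sqrt(2.0)
    return v * g  # texture is constant over the pulses of a given cell

rng = np.random.default_rng(1)
c = sirp_clutter(20000, 8, shape=0.5, rng=rng)
mean_power = np.mean(np.abs(c) ** 2)
# Ratio E|c|^4 / (E|c|^2)^2 equals 2 for Gaussian clutter and grows as the
# shape parameter decreases (heavier tails).
tail_ratio = np.mean(np.abs(c) ** 4) / mean_power ** 2
```

Large values of `shape` make the texture nearly constant, so the samples approach the Gaussian case.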

For the multichannel problem, the clutter in each of the
J channels is given by (2). The PDF of V, $f_V(v)$, is
defined to be the characteristic PDF of the SIRP. The
amplitude of $c_k(n)$ is K-distributed for Generalized-Chi
distributed $f_V(v)$ and includes the Gaussian model
($\alpha = \infty$) as a special case. The disturbance d
contains partially correlated clutter c modeled by a
K-distributed amplitude, with PDF


A. Non-Adaptive Test Statistics

For known $R_d$, the optimal test for detecting a rank
one signal in Gaussian interference is given by

$\Lambda_{MF} = \frac{|e^H R_d^{-1} x|^2}{e^H R_d^{-1} e} \underset{H_0}{\overset{H_1}{\gtrless}} \lambda_{MF}$.   (4)


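In some instances, the test data vector can have a
covariance matrix $\eta^2 R_d$, where $\eta^2$ is an unknown level.
The phase invariant matched filter (PI-MF) test for
these problems is expressed as [14]

$\Lambda_{PIMF} = \frac{|e^H R_d^{-1} x|^2}{\eta^2 e^H R_d^{-1} e} \underset{H_0}{\overset{H_1}{\gtrless}} \lambda_{PIMF}$   (5)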


where e and x are the concatenated JN x 1 signal
'search' steering and data vectors, respectively. The
inner product of whitened vectors $b = R_d^{-1/2} x$ and
$f = R_d^{-1/2} e$ is the matched filtering operation. Although
(5) does not require knowledge of signal phase, it does
require knowledge of the level $\eta^2$ to be CFAR. The
optimal test for this problem is the NMF test [14] given
by

$\Lambda_{NMF} = \frac{|e^H R_d^{-1} x|^2}{[e^H R_d^{-1} e][x^H R_d^{-1} x]} \underset{H_0}{\overset{H_1}{\gtrless}} \lambda_{NMF}$.   (6)


The test statistic of (6) is simply the squared magnitude
of the inner product of the vectors f and b normalized
by their squared norms, so that $0 \le \Lambda_{NMF} \le 1$.
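
The matched filter and normalized matched filter statistics can be computed directly from their definitions. The sketch below is illustrative only: the covariance, steering vector and data are random placeholders, and the helper names are hypothetical.

```python
import numpy as np

def mf_stat(e, x, R):
    """Matched filter statistic |e^H R^{-1} x|^2 / (e^H R^{-1} e)."""
    Rinv_x = np.linalg.solve(R, x)
    Rinv_e = np.linalg.solve(R, e)
    return abs(e.conj() @ Rinv_x) ** 2 / (e.conj() @ Rinv_e).real

def nmf_stat(e, x, R):
    """Normalized matched filter: MF statistic divided by x^H R^{-1} x."""
    Rinv_x = np.linalg.solve(R, x)
    Rinv_e = np.linalg.solve(R, e)
    num = abs(e.conj() @ Rinv_x) ** 2
    return num / ((e.conj() @ Rinv_e).real * (x.conj() @ Rinv_x).real)

rng = np.random.default_rng(0)
dim = 8  # plays the role of J*N
A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
R = A @ A.conj().T + np.eye(dim)   # arbitrary Hermitian positive definite matrix
e = np.exp(2j * np.pi * 0.15 * np.arange(dim))
x = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)

mf = mf_stat(e, x, R)
nmf = nmf_stat(e, x, R)
nmf_scaled = nmf_stat(e, 5.0 * x, R)   # scaling the data leaves the NMF unchanged
mf_scaled = mf_stat(e, 5.0 * x, R)     # but scales the MF by the power factor
```

The invariance of `nmf` to the data scale is what makes the NMF insensitive to an unknown level, while `mf` changes by the square of the scale factor.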


$f_R(r) = \frac{\beta^{\alpha+1} r^{\alpha}}{2^{\alpha-1} \Gamma(\alpha)} K_{\alpha-1}(\beta r), \quad r \ge 0, \ \beta, \alpha > 0$   (3)

where $\beta$ and $\alpha$ are the distribution scale and shape
parameters, respectively, $K_\nu(\cdot)$ is the modified Bessel
function of the second kind of order $\nu$, and $\Gamma(\cdot)$ is the
Eulero-Gamma function. Small values of $\alpha$ result in
heavy tails for the PDF of (3). From (2), the clutter
covariance matrix is $R_c = R_g E(V^2)$ where
$R_g \in \mathbb{C}^{JN \times JN}$ is the covariance of the Gaussian (speckle)
component and $E(V^2)$ relates to the texture power.

In practice, $R_d$ is unknown, and must be estimated
from a signal-free JN x K secondary data matrix, Z,
whose columns are assumed to be statistically independent
and identically distributed (iid) as the test data.
For Gaussian disturbance, the maximum likelihood
(ML) estimator is the sample matrix $\hat{R}_d = Z Z^H / K$.
However, $\hat{R}_d$ is not the ML estimate for compound-Gaussian
clutter.

III.  Test  Statistic  Descriptions 

We  now  consider  several  non-adaptive  and  adaptive 
detection  test  statistics  in  this  section. 


B.  Adaptive  Test  Statistics 

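For the adaptive problem, $\hat{R}_d$ replaces $R_d$. Consequently,
the adaptive version of the test of (4), denoted
as the AMF, is given by

$\Lambda_{AMF} = \frac{|e^H \hat{R}_d^{-1} x|^2}{e^H \hat{R}_d^{-1} e} \underset{H_0}{\overset{H_1}{\gtrless}} \lambda_{AMF}$.   (7)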


Observe that $\Lambda_{AMF}$ is simply the adaptive version of
$\Lambda_{MF}$. For the special case of $\eta = 1$, this test was
developed independently in [19, 20] where its CFAR behavior
was noted. This property is lost when $\eta \neq 1$.

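The adaptive version of the test of (6) is given by

$\Lambda_{NAMF} = \frac{|e^H \hat{R}_d^{-1} x|^2}{[e^H \hat{R}_d^{-1} e][x^H \hat{R}_d^{-1} x]} \underset{H_0}{\overset{H_1}{\gtrless}} \lambda_{NAMF}$.   (8)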


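Another adaptive detection test known as the Kelly
GLRT [21] is expressed as

$\Lambda_{GLRT} = \frac{|e^H \hat{R}_d^{-1} x|^2}{[e^H \hat{R}_d^{-1} e][1 + \frac{1}{K} x^H \hat{R}_d^{-1} x]} \underset{H_0}{\overset{H_1}{\gtrless}} K \lambda_{GLRT}$   (9)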
where $0 < \lambda_{GLRT} < 1$. For $K \to \infty$, the tests of (7)
and (9) converge to the test of (4), whereas the test of
(8) converges to that of (6).
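
The three adaptive statistics share most of their computation, as the following sketch shows. It assumes the sample covariance $\hat{R}_d = Z Z^H / K$ with K at least JN (so that the estimate is invertible) and a GLRT denominator of the form $1 + x^H \hat{R}_d^{-1} x / K$; the function name, data and dimensions are illustrative placeholders.

```python
import numpy as np

def adaptive_stats(e, x, Z):
    """Return (AMF, NAMF, GLRT) statistics using R_hat = Z Z^H / K."""
    K = Z.shape[1]
    R_hat = Z @ Z.conj().T / K          # sample covariance from secondary data
    Ri_x = np.linalg.solve(R_hat, x)
    Ri_e = np.linalg.solve(R_hat, e)
    num = abs(e.conj() @ Ri_x) ** 2     # |e^H R_hat^{-1} x|^2
    ee = (e.conj() @ Ri_e).real         # e^H R_hat^{-1} e
    xx = (x.conj() @ Ri_x).real         # x^H R_hat^{-1} x
    amf = num / ee
    namf = num / (ee * xx)
    glrt = num / (ee * (1.0 + xx / K))  # assumed form of the Kelly GLRT
    return amf, namf, glrt

rng = np.random.default_rng(3)
dim, K = 8, 32                           # dim plays the role of J*N
Z = rng.standard_normal((dim, K)) + 1j * rng.standard_normal((dim, K))
x = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
e = np.ones(dim) / np.sqrt(dim)
amf, namf, glrt = adaptive_stats(e, x, Z)
```

As K grows, the term `xx / K` vanishes and the GLRT statistic approaches the AMF statistic, in line with the stated convergence.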

In this paper, we consider the performance of the
tests of (7)-(9) in compound-Gaussian clutter. No
optimality or CFAR claims of these tests can be made for
the case of SIRP disturbance.


C.  Model-Based  STAP  Tests 


For multichannel model-based methods [15], the
whitening operation is performed using prediction error
filters (PEF) involving time series or state space
architectures. We define $y_P(n)$ as the J x 1 vector error
residual output of a Pth-order multichannel linear filter. For
a multichannel (vector) autoregressive model, a tapped
delay line architecture is used where the Pth order filter
coefficients, $A(k)$, are estimated from Z using a
multichannel parameter estimation algorithm. These J x J
matrix coefficients provide a succinct description of the
spatio-temporal disturbance correlation. Specifically,


$y_P(n) = D_0^{-1/2} L_0^{-1} u_P(n) = D_0^{-1/2} L_0^{-1} \left[ x(n+P) + \sum_{k=1}^{P} A(k)\, x(n-k+P) \right], \quad n = 0, 1, \ldots, N-P-1$   (10)

where (10) implicitly defines the temporally whitened
J x 1 error residual $u_P(n)$, with covariance $\Sigma_u$. In
practice, $\Sigma_u$ is unknown and the estimation algorithms
produce its estimate $\hat{\Sigma}_u$. However, this paper
employs $\hat{\Sigma}_u$ obtained by averaging the outer products of
$u_P(n)$, where $u_P(n)$ results from a transformation of
the secondary data by the prediction error filter with
fixed $A(k)$. The $LDL^H$ decomposition of $\hat{\Sigma}_u$ yields
$(L_0, D_0)$ which are used to spatially whiten $u_P(n)$ [15].
Finally, $y_P(n)$ is obtained by a transformation of $x(n)$
by the multichannel prediction error filter with
coefficients $(L_0, D_0)$ and $A(k)$ as denoted in (10). Similarly,
the transformed steering vector $s(n)$ is obtained by
sequencing the sequential form of the 'search' steering
vector $e(n)$ through the PEF [3, 16]. The normalized
parametric adaptive matched filter (N-PAMF) [10, 18]
is now defined as


$\Lambda_{N-PAMF} = \frac{\left| \sum_{n=0}^{N-P-1} s^H(n)\, y_P(n) \right|^2}{\left[ \sum_{n=0}^{N-P-1} s^H(n)\, s(n) \right] \left[ \sum_{n=0}^{N-P-1} y_P^H(n)\, y_P(n) \right]}$   (11)

A related parametric adaptive matched filter (PAMF)
test was first derived in [3] for Gaussian disturbance.
The PAMF test is identical to (11) but excludes the
second bracketed denominator term. In [16], several
estimation algorithms are considered in the PAMF
implementation for Gaussian disturbance. In this paper
the multichannel least squares method is used for filter
parameter estimation.
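
Once the whitened residual sequences are available, the PAMF and N-PAMF statistics reduce to a few inner products. The sketch below starts from that point: it takes the whitened steering sequence s(n) and data residual y(n) as given arrays and does not implement the prediction error filter or its least squares estimation. The array shapes, names and random placeholders are assumptions for illustration.

```python
import numpy as np

def pamf_stats(s, y):
    """Return (PAMF, N-PAMF) from whitened sequences of shape (N - P, J)."""
    # np.vdot conjugates its first argument and sums over all entries,
    # which gives sum_n s^H(n) y(n).
    num = abs(np.vdot(s, y)) ** 2
    s_energy = np.vdot(s, s).real       # sum_n s^H(n) s(n)
    y_energy = np.vdot(y, y).real       # sum_n y^H(n) y(n)
    pamf = num / s_energy               # omits the second denominator term
    npamf = num / (s_energy * y_energy)
    return pamf, npamf

rng = np.random.default_rng(4)
n_res, J = 29, 8                        # e.g. N - P = 32 - 3 residual vectors
s = rng.standard_normal((n_res, J)) + 1j * rng.standard_normal((n_res, J))
y = rng.standard_normal((n_res, J)) + 1j * rng.standard_normal((n_res, J))

pamf, npamf = pamf_stats(s, y)
_, npamf_scaled = pamf_stats(s, 3.0 * y)   # data scaling does not change N-PAMF
_, npamf_matched = pamf_stats(s, 2j * s)   # perfectly matched data gives 1
```

The extra normalization by the residual energy is what removes the dependence on the data scale, and hence on the texture power.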


IV.  Analytical  Results 

The probability of false alarm and probability of
detection for NMF operating in compound-Gaussian clutter
(without background white noise) are given by [9]

$P_{fa-NMF} = P(\Lambda_{NMF} > \lambda_{NMF} \mid H_0) = (1 - \lambda_{NMF})^{JN-1}$   (12)

$P_{d-NMF} = 1 - (1 - \lambda_{NMF})^{JN-1} \times \sum_{k=1}^{\ldots} b_k \, \ldots \, [\,\ldots - g(\cdot)\,]$   (13)

where $g(\cdot)$ is defined in [9] and $b_k = \frac{\Gamma(JN)}{\Gamma(k+1)\,\Gamma(JN-k)}$.
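
The false alarm expression (12) can be inverted for the threshold and checked by simulation. The sketch below assumes whitened data (so the disturbance covariance is the identity), a gamma-distributed texture power, and small illustrative dimensions; these choices and the helper name are not taken from the paper.

```python
import numpy as np

def nmf_threshold(pfa, dim):
    """Invert P_fa = (1 - lam)^(dim - 1) for the threshold, with dim = J*N."""
    return 1.0 - pfa ** (1.0 / (dim - 1))

J, N = 2, 4
dim = J * N
lam = nmf_threshold(0.01, dim)

rng = np.random.default_rng(2)
trials = 200_000
e = np.ones(dim) / np.sqrt(dim)   # unit-norm steering vector (whitened domain)
g = (rng.standard_normal((trials, dim))
     + 1j * rng.standard_normal((trials, dim))) / np.sqrt(2.0)
# Compound-Gaussian data: each trial is scaled by its own random texture.
v = np.sqrt(rng.gamma(0.5, 2.0, size=(trials, 1)))
x = v * g

# NMF statistic |e^H x|^2 / (e^H e * x^H x) for every trial.
stat = np.abs(x @ e.conj()) ** 2 / np.sum(np.abs(x) ** 2, axis=1)
pfa_mc = np.mean(stat > lam)
```

Because the texture cancels between numerator and denominator, the estimated false alarm rate does not depend on the gamma shape parameter used for `v`.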

The expressions for Pfa and Pd for the NAMF
operating in Gaussian clutter are given by [9]

$P_{fa-NAMF} = \int_0^1 \frac{f_\Gamma(\gamma)}{[1 + (1 - \gamma)\lambda_{NAMF}]^{L}} \, d\gamma$   (14)

Pd- 


NAMF 


[1  +  (1  -  j)Xnamf)l 

m= 1  v  7 

(  A(  1  —  7)  \ 

x[l -gammainc  ^ - (1 

(15) 

where $\Gamma$, gammainc(.) and $f_\Gamma(\gamma)$ are defined in [9].

Corresponding Pfa and Pd expressions for the NAMF
operating in SIRP clutter are difficult to derive. This is
due to the fact that the ML estimate of the SIRP covariance
matrix is not available in closed form [22]. Hence,
the NAMF PDF cannot be determined analytically.

The Pfa and Pd expressions for the N-PAMF and
PAMF operating in both Gaussian and non-Gaussian
clutter scenarios are lacking since the analysis becomes
mathematically intractable. Consequently, performance
evaluation of these methods is carried out by
Monte-Carlo techniques.


V.  Performance  Results 

Performance is now presented for the detectors
described above. Probability of detection (Pd) is
computed for a Pfa of 0.01 via 100,000 Monte-Carlo trials
using a physical model of an airborne radar scenario [4].
The target signal is located at a normalized Doppler
frequency $f_{dt} = 0.15$ (unless otherwise stated) and
azimuth $\phi = 0$. The clutter ridge is located
along the normalized angle-Doppler plane diagonal with
a 40 dB (per pulse, per channel) clutter-to-noise ratio
(CNR). The one-lag clutter temporal correlation parameter
[15] is 0.999. Disturbance correlation estimates
are obtained using K secondary data cells. The output
signal-to-interference plus noise ratio (SINR) is defined
as $\mathrm{SINR} = |a|^2 e^H R_d^{-1} e$. All performance results are
obtained with compound-Gaussian clutter plus additive
thermal white noise.

Fig. 1. Pd versus output SINR in Gaussian disturbance

Fig. 2. Pd versus SINR in K-distributed clutter, $\alpha = 0.5$

Figure 1 shows Pd versus output SINR for Gaussian
disturbance with $\eta^2 = 1$. Shown here are analytical
Pd plots for the MF and NMF with known $R_d$. The
analytical NAMF Pd curve is also shown for K=128.
Monte-Carlo generated Pd results for the PAMF(MLS),
N-PAMF(MLS), NAMF, and Kelly GLRT are depicted.
Performance of the Kelly GLRT and the NAMF are
indistinguishable for this example. Note that the
N-PAMF method with P = 3 nearly achieves the NMF
performance for low sample support size K=12.
Singularity of $\hat{R}_d$ for K=12 precludes implementation of
the AMF, NAMF, and Kelly GLRT.

Figures 2 and 3 display Pd versus output SINR for the
NMF, NAMF, N-PAMF(MLS), Kelly GLRT and AMF receivers
for clutter processes with shape parameters $\alpha = 0.5$ and
$\alpha = 0.1$, respectively. Observe the robust behavior
of the N-PAMF, NAMF, and Kelly GLRT in compound-Gaussian
clutter. The Kelly GLRT outperforms the
NAMF at high SINRs, whereas this condition reverses
at low SINRs.

Fig. 3. Pd versus output SINR in K-distributed clutter, $\alpha = 0.1$

Figure 4 plots Pd versus the clutter shape parameter
$\alpha$ at output SINR = 6 dB with $\alpha$
ranging from 0.1 to 1,000. For the K-distribution,
$\alpha > 4$ approximates the Gaussian case. The results
reveal the robustness of the N-PAMF and NAMF tests
over a wide range of shape parameters. However, the
N-PAMF(MLS) shows superior performance approaching
that of the NMF. Performance of the PAMF and
AMF degrade in heavy-tailed compound-Gaussian clutter.
However, for $\alpha > 100$ (Gaussian region), they
incur no performance degradation. Figure 5 shows
Pfa versus shape parameter $\alpha$ with each test statistic
threshold held fixed to obtain Pfa = 0.01 for Gaussian
disturbance ($\alpha = \infty$). A significant increase in
Pfa for the NAMF and Kelly GLRT confirms their lack
of CFAR with respect to texture variations. Figures
6, 7, and 8 depict plots of Pfa versus threshold for
the Kelly GLRT, NAMF, and N-PAMF, respectively,
for several K-distribution shape parameter values. The
curves for the N-PAMF exhibit much lower variability
compared to the Kelly GLRT and NAMF, reflecting
a robust CFAR performance with respect to the
texture PDF.

Fig. 4. Pd versus shape parameter ($\alpha$)

Fig. 5. Pfa versus clutter shape parameter ($\alpha$): fixed threshold

Figures 9 and 10 plot the test statistic vs
range cell using data from the Air Force Research
Laboratory (AFRL) Multichannel Airborne Radar Measurement
(MCARM) program with an inserted target signal
at range bin index 310. Specifically, data corresponding
to acquisition '220' from flight '5', cycle 'e' is used in
the examples presented here. For these results, we use
J = 8 and N = 32. We define the performance measure $\Phi_1$
as the ratio of the test statistic at the test cell to the
mean of the test statistics formed from adjacent cells,
and $\Phi_2$ as the ratio of the test statistic at the test cell
to the highest test statistic formed from adjacent cells.
Figure 9 plots the test statistics for the NAMF with
K=512 and N-PAMF(MLS) (P=2) with K=16. Figure
10 plots the test statistics for the Kelly GLRT with
K=512 and the N-PAMF(MLS) (P=2) with K=16.
Table 1 shows $\Phi_1$ and $\Phi_2$ for several values of K and P
using the N-PAMF. Note that the N-PAMF with P=2
and K=16 provides the best performance for this
scenario. In this instance, large sample support does not
result in improved performance due to training data
nonhomogeneity.
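
The two performance measures can be computed from a vector of test statistics as sketched below. The window of adjacent cells, the use of 10 log10 for the conversion to dB, and the function name are assumptions made for this illustration; the paper does not state how many adjacent cells were used.

```python
import numpy as np

def detection_margins_db(stats, test_cell, half_window):
    """Ratios (in dB) of the test-cell statistic to the mean and to the
    maximum of the statistics in the adjacent cells."""
    lo = max(0, test_cell - half_window)
    hi = min(len(stats), test_cell + half_window + 1)
    # Adjacent cells on both sides, excluding the test cell itself.
    neighbours = np.concatenate([stats[lo:test_cell], stats[test_cell + 1:hi]])
    ratio_mean = 10.0 * np.log10(stats[test_cell] / neighbours.mean())
    ratio_max = 10.0 * np.log10(stats[test_cell] / neighbours.max())
    return ratio_mean, ratio_max

# Toy range profile: a target in cell 10 and one strong residual in cell 12.
stats = np.full(21, 0.01)
stats[10] = 1.0
stats[12] = 0.1
margin_mean_db, margin_max_db = detection_margins_db(stats, test_cell=10, half_window=10)
```

The ratio to the maximum is always the smaller of the two, since a single strong neighbouring cell dominates it while barely moving the mean.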


Test                   P    $\Phi_1$ (dB)   $\Phi_2$ (dB)
NAMF (K=512)           -    16.2            7.45
Kelly GLRT (K=512)     -    16.71           8.03
N-PAMF (K=512)         4    19.3            12.3
N-PAMF (K=32)          3    22.1            14.7
N-PAMF (K=16)          3    21.6            15.2
N-PAMF (K=32)          2    21.8            11.3
N-PAMF (K=16)          2    22.4            14.7

Table 1: Values of $\Phi_1$ and $\Phi_2$ for the N-PAMF (MLS),
NAMF, and Kelly GLRT


Fig. 6. Pfa versus threshold ($\lambda_{GLRT}$) for the Kelly GLRT


Fig.  7.  Pfa  versus  threshold  for  the  NAMF 


Fig.  8.  Pfa  versus  threshold  for  the  N-PAMF 


Fig.  9.  Test  Statistic  versus  Range 
Fig. 10. Test Statistic versus Range


VI.  Summary  and  Future  Research 

In this paper, we have evaluated the performance of
the N-PAMF, NAMF, and Kelly GLRT in compound-Gaussian
clutter disturbance. The robust detection
performance of these methods was shown over a wide range
of clutter texture power variations (shape parameters)
for K-distributed clutter processes. Performance of the
N-PAMF was found to be close to the NMF. Next, the
CFAR behavior was considered by observing the
detection threshold variations with respect to shape
parameter. Additionally, we demonstrated the robustness of
the N-PAMF method with respect to small sample
support size K (secondary data cells) used to estimate the
disturbance correlation. This property is significant in
operational scenarios involving range varying
nonstationary clutter which severely limits the availability of
representative training data. Examples with real data
illustrate the potential for considerable performance
improvement of the N-PAMF over the NAMF and Kelly
GLRT.
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ABSTRACT 

Surface-wave over-the-horizon radars, especially those located in
tropical areas such as Northern Australia, are usually strongly
affected by external impulsive noise. Apart from thunderstorm
activity, man-made (industrial) noise observed over the typically quite
long coherent integration time is often impulsive in nature as well.

In this paper we analyse the efficiency of temporal and spatial
adaptive techniques for impulsive noise mitigation. We demonstrate
that for heavily contaminated dwells, new spatio-temporal
adaptive processing is most effective. Initial impulsive noise
mitigation, produced by adaptive spatial processing, is used for
range- and azimuth-dependent sea-clutter spectrum estimation. The
estimated sea-clutter spectrum is then used to “restore” the “missing”
data originally contaminated by impulsive noise.

1.  DESCRIPTION  AND  ANALYSIS  OF  MITIGATION 
TECHNIQUES 

The High Frequency Over-the-Horizon Radar (HF OTHR) probably
constitutes the most prominent example of radars subjected to
severe impulsive noise interference. Tropical thunderstorms, which
are extremely active in equatorial regions such as Northern Australia,
typically generate a significant number of lightning strikes
within the operational range of HF OTHR due to relatively long
coherent processing intervals. In [2], based on experimental data
collected in Northern Australia, we introduced point process models
for atmospheric noise adequate to spatial and temporal adaptive
impulsive noise mitigation. It has been suggested that an optimal
mitigation technique should incorporate both spatial and temporal
domains, based on the properties of the particular lightning strike.

Our recent experimental trial, conducted from May to September
2000 in Northern Australia, revealed that accidental human-made
noise, which quite often interferes with an HF radar, is in most
cases also highly nonstationary. While an atmospheric strike typically
occupies a single repetition period or at most a few consecutive
repetition periods (for high air-mode waveform repetition frequencies,
WRF = 40-60 Hz), man-made impulsive interference typically
occupies significantly longer intervals, measured in dozens of
repetition periods (sweeps). Typical examples of atmospheric and
man-made impulsive noise are presented in Figs. 1 and 2. The
amplitude of the range-processed data at the output of one particular
beam is shown for different ranges (y-axis) as a function of
repetition period (x-axis). One can see a significant difference in
the number of sweeps contaminated by atmospheric and man-made
impulsive noise. Another important feature demonstrated by these
figures is the availability of “sea clutter-free ranges”. These ranges
allow for straightforward identification of sweeps affected by
impulsive noise.
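
The identification step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `flag_corrupted_sweeps`, the median-based noise floor and the threshold `factor` are assumptions introduced here. The power in the sea clutter-free range cells is averaged for each sweep, and a sweep is flagged when that power exceeds a multiple of the median.

```python
import numpy as np

def flag_corrupted_sweeps(range_sweep, free_ranges, factor=10.0):
    """Return indices of sweeps that look contaminated by impulsive noise.

    range_sweep : complex array (n_ranges, n_sweeps), range-processed data
                  at the output of one beam.
    free_ranges : indices of range cells assumed free of sea clutter.
    factor      : hypothetical threshold relative to the median sweep power.
    """
    # Mean power per sweep over the clutter-free range cells only.
    power = np.mean(np.abs(range_sweep[free_ranges, :]) ** 2, axis=0)
    # The median is insensitive to a minority of corrupted sweeps.
    floor = np.median(power)
    return np.flatnonzero(power > factor * floor)
```

The median is used because only a minority of the dwell is expected to be corrupted, so it remains a reasonable estimate of the ambient noise level.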

Obviously, analysis of impulsive noise mitigation efficiency
should be expanded to man-made interference. Indeed, since only
up to 30% of the entire dwell is typically corrupted, there is reason
to compare spatial techniques with temporal ones [1].

In this paper we introduce a comparative analysis of different
temporal and spatial adaptive techniques suitable for impulsive
noise mitigation.

Since the actual interval corrupted by impulsive noise is easily
identified, temporal techniques focus on proper estimation
of the missing sea-clutter data. For surface-wave radars with
typically very high sub-clutter visibility, which can range far above
60 dB, accurate estimation can become a problem.

To address this problem, two major approaches could be adopted.
The first one is based on the classical Wiener prediction filter. The
complicated nature and range/azimuth variability of the sea-clutter
Doppler spectrum impose limitations on the actual accuracy of this
approach.

The second technique is based on direct optimization of the
replacement data to minimize the total power within a specified
range of Doppler frequencies that are expected to be free of sea
clutter. This technique has different limitations, especially when
the number of missing data is quite large and they are consecutive.
In attempting to minimize the overall power, strong targets could
be suppressed and some important features of the sea-clutter
spectrum could be significantly damaged. Spatial techniques are
effective when strong impulsive interference impinges on the
beampattern sidelobes. Meanwhile, when the entire coverage is important,
there will always exist directions corrupted by impulsive noise
propagated via the main beam.

Comparative analysis of the above-mentioned techniques was
first done on uncorrupted SW OTHR data with sub-clutter
visibility close to the limit. One selected example is shown in Fig. 3.
A certain number of sweeps were nominated as being “corrupted”
and the two above-mentioned temporal techniques were
used to restore the “missing” data.

In order to apply the classic prediction (interpolation) approach,
we first estimated the sea-clutter temporal covariance matrix. With
N = 1000 repetition periods typically used in ship mode, we
selected M < N/2, M = 400, as the dimension of the prediction/interpolation
filter, in the expectation that whatever the actual number of missing
repetition periods is, there should still be a sufficient number of
uncorrupted repetition periods (sweeps) within the corresponding
M-variate “sliding window” of our prediction filter. The M-variate
(range-dependent) sea-clutter covariance matrix is estimated here
by forward-backward averaging:

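$$\hat{R}_d = \sum_{j=1}^{N-M+1} \left[ X_j^d (X_j^d)^H + J (X_j^d)^* (X_j^d)^T J \right] \qquad (1)$$

where $X_j^d$ is the M-variate window of consecutive repetition periods
starting at the j-th one, $J$ is the exchange (reversal) matrix,
and $x_j^d$ is the complex number that corresponds to the j-th
repetition period and d-th range cell. The particular beam number is not
essential for this temporal processing.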

Let us introduce an M x (M - m)-variate incidence matrix
$H_m$ that is constructed as the standard identity matrix with the m
columns deleted at positions that correspond to the “missing” sweeps.
Then the adaptive prediction filter that generates an estimate of the
k-th missing datum is defined as

$$w_k = \left[ H_m^H R_d H_m \right]^{-1} H_m^H r_d^k, \quad k = 1, \ldots, m \qquad (4)$$

where $r_d^k$ is the k-th column of the M-variate matrix $R_d$.

Correspondingly, the estimate $\hat{x}_k^d$, k = 1, ..., m, of the k-th
missing sweep is defined as

$$\hat{x}_k^d = w_k^H H_m^H X_d, \quad k = 1, \ldots, m \qquad (5)$$
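
The prediction (interpolation) approach of (1), (4) and (5) can be illustrated with a short sketch. It is a minimal example under stated assumptions rather than the authors' implementation: the function names, the normalisation of the covariance estimate and the small diagonal loading (added only to keep the linear solve well behaved) are introduced here for illustration.

```python
import numpy as np

def fb_covariance(x, M):
    """Forward-backward sample covariance of M-variate sliding windows, cf. (1)."""
    N = x.size
    J = np.eye(M)[::-1]                       # exchange (reversal) matrix
    R = np.zeros((M, M), dtype=complex)
    for j in range(N - M + 1):
        X = x[j:j + M].reshape(-1, 1)         # M-variate window starting at j
        R += X @ X.conj().T + J @ X.conj() @ X.T @ J
    return R / (2 * (N - M + 1))              # normalisation is an assumption

def wiener_restore(window, missing, R, loading=1e-6):
    """Replace the 'missing' entries of one M-variate window, cf. (4)-(5)."""
    M = window.size
    valid = np.setdiff1d(np.arange(M), missing)
    H = np.eye(M)[:, valid]                   # incidence matrix, M x (M - m)
    A = H.T @ R @ H                           # covariance of the valid sweeps
    # Hypothetical diagonal loading, not part of the source equations.
    A = A + loading * np.trace(A).real / A.shape[0] * np.eye(A.shape[0])
    restored = window.copy()
    for p in missing:
        w = np.linalg.solve(A, H.T @ R[:, p])       # prediction filter, (4)
        restored[p] = w.conj() @ (H.T @ window)     # estimate, (5)
    return restored
```

Only the valid sweeps enter the estimate, so the values stored at the missing positions of `window` (zeros or corrupted samples) have no influence on the result.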

Our second approach is based on a direct search for the m-variate
vector $x_m$ of “missing” data that, together with the remaining
(N - m) “valid” data, results in the minimal total power within
some designated range of Doppler frequencies.

Specifically, let us present the overall N-variate vector $X_d$ as

$$X_d = X_0 + A_m x_m \qquad (6)$$

where $X_0$ is an N-variate vector with zeroes at the positions of the
“missing” data and $A_m$ is an N x m-variate incidence matrix in which
the rows of the m-variate identity matrix are “spread” over N rows,
corresponding to the positions of the missing data.

The weighted Discrete Fourier Transform (DFT) of the vector
$X_d$ could be presented as

$$F D (X_0 + A_m x_m) \qquad (7)$$

and, with the (N - n) x N selection matrix $S$, the (N - n)-variate
vector of selected Doppler bins within the d-th range Doppler
spectrum could be presented as

$$S F D (X_0 + A_m x_m) \qquad (8)$$

where $F$ is the N-variate DFT matrix and $D$ is a diagonal weighting
matrix (e.g. a Blackman window).


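Finally, the overall power within this Doppler window could
be presented as

$$X_0^H D F^H S^T S F D X_0 + x_m^H A_m^H D F^H S^T S F D A_m x_m + \qquad (9)$$

$$x_m^H A_m^H D F^H S^T S F D X_0 + X_0^H D F^H S^T S F D A_m x_m. \qquad (10)$$

Correspondingly, the optimum solution is

$$\hat{x}_m = -\left[ A_m^H D F^H S^T S F D A_m \right]^{-1} A_m^H D F^H S^T S F D X_0. \qquad (11)$$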

(For a rank-deficient matrix $[A_m^H D F^H S^T S F D A_m]$ this solution is
modified to operate on the signal subspace of this matrix.) Now these
techniques can be compared. Fig. 4 presents the Doppler spectra
for m = 100 “missing” data for one range cut. Both random
(atmospheric-like) and continuous (man-made-like) distributions
of “missing” data within the 400-sweep-long window have been
analysed. Different numbers of missing sweeps have been analysed,
m = 1, 40, 60, 100; however, only the m = 100 continuous case
processed with the optimization filter is shown (the only one which shows
any difference from the original).
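
The direct optimization can be sketched numerically as below. This is an illustrative sketch only: the function name `optimise_missing` and its arguments are hypothetical, and a least-squares solver is used in place of the explicit inverse. When the matrix is rank deficient the solver returns a minimum-norm solution, which is only assumed here to play a similar role to the signal-subspace modification mentioned in the text.

```python
import numpy as np

def optimise_missing(x0, missing, free_bins):
    """Estimate the 'missing' data by minimising the windowed-DFT power
    in the Doppler bins expected to be free of sea clutter.

    x0        : N-variate complex vector with zeros at the missing positions.
    missing   : indices of the missing sweeps.
    free_bins : indices of the selected (clutter-free) Doppler bins.
    """
    N = x0.size
    F = np.fft.fft(np.eye(N))          # N-variate DFT matrix
    D = np.diag(np.blackman(N))        # diagonal weighting (Blackman window)
    S = np.eye(N)[free_bins, :]        # selection matrix
    A = np.eye(N)[:, missing]          # incidence matrix, N x m
    B = S @ F @ D
    # Minimise || B (x0 + A x_m) ||^2 over x_m in the least-squares sense.
    x_m, *_ = np.linalg.lstsq(B @ A, -(B @ x0), rcond=None)
    return x_m
```

Because the Blackman weights approach zero at both ends of the dwell, missing sweeps close to the edges contribute very little to the cost and are poorly constrained in this sketch.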

The results demonstrate that for randomly distributed “missing”
data both techniques provide equally good restoration. The
prediction errors are equally small and sub-clutter visibility is
restored to the original level in this case. However, in the case of
continuous “missing” data both methods work equally well only
for a small number of “missing” sweeps. For an increased number of
consecutive missing data the difference between these two
techniques becomes more significant. While classical prediction
still efficiently restores the missing data (up to 100 missing data
for the 400-variate prediction filter), optimization (11) generates
estimates $\hat{x}_m$ significantly different from the true missing ones. These
estimates lead to a reduction in overall noise power within the
specified Doppler area, but the overall structure of the Doppler
spectrum changes significantly. For most practical applications these
changes cannot be tolerated. Moreover, with a significant number
of “degrees of freedom”, total power minimization could
considerably reduce the target signal as well. Thus, for randomly
distributed missing data or a small number of consecutive missing
data (up to 20 consecutive sweeps), the optimization technique (11)
can be recommended as the preferred option since it does not
involve (adaptive) sea-clutter spectrum estimation.

For typical man-made impulsive interference, this approach is not
appropriate and attention should be directed to a practical
implementation of the adaptive prediction technique (4)-(5). The main
problem here is to get an accurate enough estimate of the sea-clutter
covariance matrix $R_d$. Since the dimension of this matrix (prediction
filter) should be significantly greater than the number of missing
sweeps, real (corrupted) data should not be used directly for the
sample matrix estimation (1) in the way it was done in our previous
study with uncorrupted data. Since all ranges are usually equally
corrupted by impulsive noise, spatial diversity could be exploited to
assist sea-clutter covariance matrix estimation. Indeed, in most
cases the truly dominant impulsive noise sources are well localized,
and even with a conventional beamformer it is usually possible to
find the least contaminated direction (beam). While Fig. 2 displays
the range map for the most occupied beam, the top line in Fig. 6
demonstrates the distribution of the impulsive noise power-to-noise
ratio across the beams. It is quite obvious that in the “minimal”
beam (N=7) the power of this noise is significantly smaller, and the
range-processed data of this beam could be used for covariance
matrix estimation. Obviously, adaptive spatial processing is even
more effective in terms of reducing the antenna pattern sidelobes
affected by impulsive noise. The bottom line in Fig. 6 presents the
similar impulsive noise to white noise ratio as a function of beam
direction for spatial adaptive processing (SAP). Here sea clutter-free
ranges are used to estimate the sample spatial covariance matrix

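$$\hat{R} = \sum_{z \in \Theta} X_z X_z^H \qquad (12)$$

$$X_z = (x_z^1, \ldots, x_z^{32})^T \qquad (13)$$

Here $\Theta$ is the area of sea clutter-free ranges, and the SAP
beamformer is defined as usual by

$$W_{SAP}(l) \qquad (14)$$

with $S_l$ as a 32-variate steering vector.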
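
A minimal sketch of a spatial adaptive beamformer trained on sea clutter-free ranges is given below. It assumes the common minimum-variance distortionless-response (MVDR) form, w = R^{-1} s / (s^H R^{-1} s); this normalisation, the diagonal loading and the name `sap_weights` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def sap_weights(snapshots, steering, loading=1e-3):
    """MVDR-type weights from clutter-free training snapshots.

    snapshots : complex array (n_elements, n_snapshots), one column per
                sea clutter-free range cell.
    steering  : complex steering vector for the look direction.
    loading   : hypothetical relative diagonal loading for robustness.
    """
    R = snapshots @ snapshots.conj().T                 # sample covariance
    n = R.shape[0]
    R = R + loading * np.trace(R).real / n * np.eye(n)
    Rinv_s = np.linalg.solve(R, steering)
    # Normalise for unit gain in the look direction (assumed convention).
    return Rinv_s / (steering.conj() @ Rinv_s)
```

The beam output for a snapshot `x` is then `np.vdot(w, x)`, i.e. w^H x.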

The most important issue that needs to be addressed to justify
this approach is the sensitivity of this technique with respect to
sea-clutter azimuthal variability. Indeed, we are prepared to use
the sea-clutter training data collected at the output of one particular
(adaptive) beam, but apply it to quite a different (conventional)
beamformer output. In order to investigate the efficiency of such a
technique, we analysed our “clean” data with nominated “missing”
data. For the Wiener prediction filter there is no visible difference for
one range-Doppler cut between the original data and the restored ones,
even for quite a large number of missing data (m=100), regardless
of whether the missing data are random or consecutive. For the
optimization filter the same can be said for a random distribution of the
replaced sweeps. However, the same cannot be said for consecutive
sweeps, as the inverted matrix becomes ill-conditioned. As
Fig. 4 demonstrates, there is some degradation of the signal in the
“sea clutter” area.

Finally, we illustrate the practical efficiency of our approach
by processing the real contaminated data shown in Figs. 1 and 2.

For the data shown in Fig. 1, optimization technique (11) should
be appropriate since only one dominant “strike” is recorded here.

The results of this technique applied to one beam (Fig. 5)
demonstrate quite impressive improvement in sub-clutter visibility. The
particular range profile shown in Fig. 7 demonstrates a 35 dB
improvement in sub-clutter visibility, while the standard SAP has delivered
a mere 8-10 dB.

Our second example deals with man-made impulsive noise,
as per Fig. 2. While SAP is practically not effective for the data
collected in the direction of impulsive noise arrival (beam 4), it
provides quite reasonable improvement for the beam 7 data, which thus
could be used as training data. The above-described approach, with
the adaptive prediction filter “trained” on beam 7 data and applied to
beam 4 data, is illustrated by Fig. 8. The particular range profile
demonstrates improvement in sub-clutter visibility of up to 20
dB. Note that quite a significant part of the coherent processing
interval (CPI) has been contaminated here. Interestingly enough, when
the same prediction filter is applied to the training data of beam
7 (Fig. 9), considerable additional improvement with respect
to the SAP processing is still obtained. The reason becomes
clear when the sea clutter-free ranges processed by SAP are
analysed: impulsive noise residues are still some 10 dB above the
ambient noise floor. Therefore the replacement of these corrupted
repetition intervals by predicted ones results in additional
improvement in sub-clutter visibility.


2.  CONCLUSION 

Analysis of selected temporal and spatial adaptive techniques for
atmospheric and man-made impulsive noise mitigation for SW
OTHR has been performed. It has been demonstrated that
temporal-only or spatial-only processing could be effective only in special
cases. For contaminated repetition periods that are randomly
distributed over the CPI, the direct optimization method is shown to be
very effective. For some beam directions affected by impulsive
noise via antenna pattern sidelobes, standard spatial adaptive
processing could also be quite effective on its own. In the more general
case, when the number of contaminated sweeps is quite significant
and they are consecutive (man-made impulsive noise), and impulsive
noise must be rejected in all directions, the proposed spatio-temporal
adaptive processing is shown to be most effective. Here spatial
(adaptive) processing is used for initial impulsive noise mitigation,
and the beam where this reduction is maximal is used as the
training beam for sea-clutter (temporal) covariance matrix estimation.
The adaptive Wiener filter trained on the spatially processed data
is then applied to the contaminated (conventionally) beamformed data
with similar energetic content of Doppler spectra.


Figure 2: Man-made impulsive noise, beam 4.


Figure 3: “Clean” data used for comparison.


Figure 5: Comparison of conventional, SAP and optimization beamformer.


Figure 6: Power distribution across beams: top, conventional beamformer; bottom, SAP.


Figure 8: Wiener prediction filter trained on beam 7, used on beam 4.


ABSTRACT

Broadband processing is an important part of the
Navy’s current and future SONAR systems. This paper
provides an introduction to a new class of passive
broadband processing algorithms, Subband Energy
Detection (SED), which includes both Subband Peak
Energy Detection (SPED) and Subband Extrema Energy
Detection (SEED). It will be shown that SED has several
performance advantages over Conventional Energy
Detection (CED), also known as Linear Rectify (LR).

SED  exploits  the  spatial  coherence  of  the  signal’s 
local  maxima  (“peaks”)  and  minima  (“valleys”)  compared  to 
the  randomness  of  noise  to  increase  the  quality  of  the 
broadband  processing  display.  Instead  of  summing  the 
energy  in  each  single  beam  over  the  frequency  band,  SED 
sums  the  energy  of  the  peaks  and  valleys  in  the  azimuth 
spectrum  for  each  frequency  bin. 

The  objective  of  this  paper  is  to  examine  the 
theory,  advantages,  and  limitations  of  Subband  Energy 
Detection.  In  doing  so,  we  will  first  give  an  overview  of 
broadband  processing  and  discuss  energy  detection 
theory.  We  will  then  describe  the  theory  of  both  CED  and 
SED.  Processed  data  from  both  sets  of  algorithms  will  then 
be  analyzed  to  uncover  the  relative  advantages  and 
disadvantages  of  each  method. 


1.  INTRODUCTION 

For a single time scan, the output of the
beamformer is a 2-dimensional matrix in frequency and
azimuth known as a FRAZ (FRequency AZimuth). Each cell
of the 2-dimensional FRAZ contains a measurement of the
energy for one azimuth and frequency bin. A typical
example FRAZ is shown in Fig. 1a. Broadband processing
methods collapse the FRAZ over frequency to a single
dimension, azimuth. The result is a bearing-time record
(BTR) display, Fig. 1b, which allows the operators to detect
contacts and provides a high level of situational
awareness.

In the past, broadband detection methods such as
CED and cross-correlation (CC) have provided this critical
function while attempting to maximize the operator’s
detection ability. Recently, a new class of broadband
detection methods, Subband Energy Detection (SED), has
been developed and has emerged as an accepted
alternative [1].

2.  ENERGY  DETECTION 

The  goal  of  energy  detection  methods  is  to  create 
an  estimate  of  the  probability  of  detection  of  an  acoustic 
source  at  a  given  time  and  location.  This  requires  the 
reduction  of  the  beamformer  output,  which  is  a  function  of 
time,  azimuth,  and  frequency  into  the  time-azimuth  plane. 
As  a  result,  both  CED  and  SED  collapse  the  beamformer 
output  over  frequency  but  each  takes  a  different  approach. 

2.1  Acoustic  Environment 

The  ocean  acoustic  environment  consists  of 
acoustic  energy  from  both  contact  signals  and  random 
noise.  This  noise  field  is  the  result  of  a  large  number  of 
factors  such  as  wave  action,  seismic  events,  marine  life, 
and  distant  shipping  activity. 

Since this noise field is a collection of sources, it
also has a certain level of directionality associated with it.
The attenuation factor for acoustic waves is also larger for
higher frequencies. The result is a noise field dominated in
power by low frequency spectral content and significantly
less high frequency content.


Figure 1: Broadband Processing Methodology (A) FRAZ - FRequency AZimuth plot for a single time scan, (B) BTR -
Bearing Time Record display that is the final broadband display.


2.2  Normalization 

Due  to  the  nature  of  the  noise  field,  energy 
detection  methods  typically  utilize  a  noise  floor  estimate. 
This  is  done  since  signal  to  noise  ratio  (SNR)  is  used  as  the 
energy  value.  It  has  been  shown  that  the  use  of  SNR 
versus  raw  signal  typically  increases  the  performance  of 
the  algorithm.  Simply  summing  the  raw  energy  in  each 
frequency  bin  ignores  the  fact  that  low  frequencies 
dominate  the  energy  distribution.  Doing  so  may  prevent 
the  detection  of  primarily  high  frequency  contacts. 

Energy  detection  methods  with  noise  floor 
estimation  have  demonstrated  good  detection  capability 
including  the  detection  of  low  SNR  contacts  (i.e.  signals 
quieter  than  the  average  noise  floor)  [2].  In  part,  this 
detection  capability  benefits  from  two  primary  concepts: 
spatial  coherence  and  sidelobe  rejection. 

2.3  Energy  Detection  Concepts 

Spatial coherence is defined as the alignment of
distinct frequency components of a contact signal. Since
the frequency components spatially align, they strengthen
the energy estimate and increase the detectability of
contact signals over random noise.

Energy detection methods also provide inherent
sidelobe rejection. The reason for this is related to the
beamforming process. Beamforming spatially filters the
elemental array time series. Ideally, there is a unity gain in
the look direction and a zero gain in all others. Realistically,
the array gain pattern, or beam pattern, includes a mainlobe
of a certain width and several sidelobes which allow noise
and interferer energy to leak into the beam measurement.


At high frequencies, the beam pattern has a
narrow mainlobe and many narrow sidelobes. As the
frequency is reduced, the lobe width increases and the
locations of the sidelobe peaks shift in azimuth. The result is
that for a single beam measurement, the mainlobe peaks
line up in the same azimuth bin for all frequencies while
sidelobe peaks spatially shift and will not line up over the
frequency range. This mitigates the effect of sidelobe
energy leakage.


3.  CONVENTIONAL  ED 

Conventional  Energy  Detection  (CED),  also 
known  as  Linear  Rectify  (LR),  is  a  traditional  energy 
detection  method.  CED  will  be  utilized  as  a  baseline  for 
evaluating  Subband  Energy  Detection  (SED)  performance. 

3.1  CED  Principles  of  Operation 

CED starts with a FRAZ for a single time scan and
processes each azimuth bin, Fig. 2a. A single azimuth bin
contains a frequency spectrum of signal plus noise, as
seen in Fig. 2b. As mentioned above, the next step is to
perform a noise estimate. The method used by CED for
estimating the noise floor applies a median filter in
frequency and azimuth.


Figure 2: Conventional Energy Detection (A) FRAZ display with arrow showing a single azimuth bin, (B) Frequency
spectrum of the measured signal plus noise for a single azimuth bin, (C) Frequency spectrum of the normalized noise
estimate.


CED then calculates the signal to noise ratio
(SNR) by dividing the beamformer output (signal plus
noise), Fig. 2b, by the noise floor estimate, Fig. 2c. Finally, it
calculates an energy estimate by summing the SNR values
in all desired frequency bins for the single azimuth bin.
This process is repeated for each azimuth bin and
every time scan to produce a BTR display, which is used to
detect acoustic contacts.
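
The CED steps described above can be sketched for one time scan as follows. This is an illustrative sketch only: the function names, the 9 x 9 median window and the edge padding are assumptions, since the text does not specify the filter dimensions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_noise_floor(fraz, win_f=9, win_az=9):
    """Noise floor estimate: 2-D median filter over frequency and azimuth.

    fraz is a real array (n_freq, n_azimuth) of beamformer output energy.
    The odd window sizes are hypothetical choices.
    """
    pad_f, pad_az = win_f // 2, win_az // 2
    padded = np.pad(fraz, ((pad_f, pad_f), (pad_az, pad_az)), mode="edge")
    windows = sliding_window_view(padded, (win_f, win_az))
    return np.median(windows, axis=(2, 3))

def ced_scan(fraz):
    """One BTR line: SNR summed over all frequency bins per azimuth bin."""
    snr = fraz / median_noise_floor(fraz)   # (signal plus noise) / noise floor
    return snr.sum(axis=0)
```

Stacking the output of `ced_scan` for successive time scans gives the rows of a BTR display.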

3.2  CED  Performance 

CED  has  been  shown  to  provide  optimal  single 
signal  detection  in  uncorrelated  noise  fields.  The 
theoretical  minimum  detectable  level  (MDL)  of  CED  for  this 
case  is  better  than  that  of  the  SED  algorithms  presented 
next.  As  such,  CED  provides  raw  optimum  detection  ability 
for  isolated  signals. 

There is, however, one major limitation of CED.
CED produces wider contact traces due to its limited
bearing resolution. This produces BTR displays with wide,
blurry traces for loud contacts. As a result, CED is not
optimal for real-world acoustic environments with multiple
signals.

The detection ability of the system in cluttered,
real-world acoustic noise environments is impaired since
the wide, blurry traces may suppress nearby, quieter
contacts. So, despite the theoretical MDL advantage of
CED for isolated signals, SED has an overall detection
advantage in clutter due to its increased bearing
resolution and narrower contact traces. This can be seen in
the results in Fig. 5 and is discussed further later.


4.  SUBBAND  ED 

Subband  Energy  Detection  (SED)  is  a  new  class  of 
energy  detection  methods.  These  algorithms  have  gained 
acceptance  and  are  currently  used  in  real  world  SONAR 
systems. 


4.1  SED  Principles  of  Operation 

SED starts with the same FRAZ information as
CED. However, instead of looking at the frequency
spectrum in a single beam, SED looks at the azimuth
spectrum for a single frequency bin, Fig. 3a. SED finds the
locations of all “peaks” and “valleys” in the azimuth
spectrum for each frequency bin. An example azimuth
spectrum is seen in Fig. 3b. A peak is simply a local
maximum in azimuth and a valley is a local minimum in
azimuth. These peaks and valleys are then used to generate
an energy estimate using one of several algorithms. Fig. 4
shows BTRs for a real acoustic data set processed by each
of the four primary SED algorithms.

4.2  SPED  and  SEED 

There  are  two  fundamental  classes  of  Subband  ED 
algorithms:  Subband  Peak  Energy  Detection  (SPED)  and 
Subband  Extrema  Energy  Detection  (SEED).  In  addition, 
each  class  has  at  least  one  version  from  two  modes:  Clutter 
Suppress  (CS)  and  Energy  Detection  (ED). 

SPED  utilizes  only  the  peak  information  to 
estimate  the  detection  probability.  It  examines  the  azimuth 
spectrum  for  every  frequency  bin  and  locates  the  peaks. 
For  each  azimuth  bin  containing  a  peak,  a  value,  or 
“reward”,  will  be  added  to  the  energy  estimate  for  that 
azimuth  bin.  The  actual  value  of  the  reward  will  depend  on 
the  mode  of  the  algorithm  (i.e.  CS  or  ED).  This  is  repeated 
for  each  frequency  bin. 

Unlike  with  CED  processing,  if  the  bin  does  not 
contain  a  peak  then  SPED  will  not  add  to  the  energy 
estimate  for  that  azimuth.  In  other  words,  SPED  sums  only 
the  energy  at  the  peaks. 




Figure  3:  Subband  Energy  Detection  (A)  FRAZ  display  with  arrow  showing  a  single  frequency  bin,  (B)  Azimuth  spectrum 
of  a  single  frequency  bin  showing  the  signal  peaks  and  the  noise  floor. 


Subband  Extrema  Energy  Detection  utilizes  both 
peak  and  valley  information  to  estimate  the  detection 
probability.  Like  SPED,  it  will  add  a  reward  for  peaks.  In 
addition,  it  will  also  subtract  a  value,  or  assess  a  “penalty” 
for  any  valley  that  is  located  in  an  azimuth  bin. 

4.3  CS and ED Modes

The  clutter  suppress  mode  (CS)  assigns  a  reward 
and  penalty  of  unity  for  each  peak  and  valley.  This  mode 
can  be  thought  of  as  a  histogram  and  basically  counts  the 
number  of  peaks  (and,  in  the  case  of  SEED,  subtracts  the 
number  of  valleys).  It  does  not  attempt  to  account  for  the 
magnitudes  of  these  peaks  and  valleys.  As  a  result,  the  CS 
mode  does  not  require  noise  floor  estimation.  This  method 
works  well  with  broadband  contacts  but  poorly  with 
contacts  containing  only  a  few  loud  frequency 
components. 

The  energy  detect  mode  assigns  a  reward  and 
penalty  based  on  signal  to  noise  ratio.  This  requires  the 
calculation  of  a  noise  estimate.  The  reward  is  simply  the 
measured  beam  noise  (signal  plus  noise)  divided  by  the 
noise  estimate.  The  penalty  calculation  is  less 
straightforward  and  is  an  area  of  current  research  [3,4,5]. 
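
The peak and valley bookkeeping described above can be sketched as follows. This is a minimal illustration, not the fielded algorithms: the function names are hypothetical, only strict interior extrema are counted, and the noise estimate for the ED mode is assumed to be supplied by the caller. SEED in ED mode is left out because the penalty calculation is not specified here.

```python
import numpy as np

def local_extrema(row):
    """Boolean masks of strict interior local maxima and minima in azimuth."""
    mid, left, right = row[1:-1], row[:-2], row[2:]
    peaks = np.zeros(row.size, dtype=bool)
    valleys = np.zeros(row.size, dtype=bool)
    peaks[1:-1] = (mid > left) & (mid > right)
    valleys[1:-1] = (mid < left) & (mid < right)
    return peaks, valleys

def sed_scan(fraz, algorithm="SPED", mode="CS", noise=None):
    """One BTR line from a FRAZ array of shape (n_freq, n_azimuth)."""
    out = np.zeros(fraz.shape[1])
    for f in range(fraz.shape[0]):            # one azimuth spectrum per bin
        peaks, valleys = local_extrema(fraz[f])
        if mode == "CS":
            out[peaks] += 1.0                 # unity reward for each peak
            if algorithm == "SEED":
                out[valleys] -= 1.0           # unity penalty for each valley
        elif mode == "ED" and algorithm == "SPED":
            # Reward: measured beam value divided by the noise estimate.
            out[peaks] += fraz[f, peaks] / noise[f, peaks]
        else:
            raise ValueError("SEED ED penalty is not defined in this sketch")
    return out
```

In CS mode the result is simply a histogram of peak locations over frequency (minus the valley count for SEED), so no noise estimate is needed.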

The noise estimate typically used is a complex
algorithm that averages over time, clips tonals, applies a
smoothing filter, and then takes the quietest value in an
azimuth sector as the noise floor.

4.4  SED  Theory 

Peaks and valleys occur due to both contact
signals and random noise. Even when the average noise
floor is greater than the contact, the fluctuations of the
noise may cause it to drop below the contact signal. When
this happens, there is a peak due to the contact signal.

In  one  frequency  bin  of  the  beam  noise  versus 
azimuth  spectrum,  there  may  be  several  peaks  due  to  the 
signal  but  still  many  more  due  to  noise.  Although  noise 
peaks  outnumber  signal  peaks,  low  SNR  contacts  may  still 
be  detected  because  peaks  due  to  contact  signals  will  have 
spatial  coherence  (i.e.  occur  in  the  same  azimuth  bin  for 
each  frequency)  while  noise  peaks  will  not.  As  a  result, 
these  signal  peaks  add  “constructively”  when  summed 
over  the  entire  range  of  frequency  bins. 

SED  is  often  referred  to  as  a  “peak-picking” 
method.  Instead  of  summing  the  energy  in  every  frequency 
bin,  SED  sums  only  the  energy  values  for  the  bins  that 
contain  extrema.  In  effect,  this  detects  only  the  peak  of  the 
mainlobe,  reduces  the  width  of  the  contact  traces,  and 
provides  increased  spatial  resolution  of  the  BTR  display. 
This  serves  to  provide  SED  with  a  detection  advantage 
over  CED  in  cluttered  environments  since  quiet  contacts 
are  no  longer  hidden  by  nearby  louder  ones. 

5.  RESULTS 

Fig.  5  shows  four  acoustic  data  sets  processed  by 
both  CED  and  SEED  CS.  The  first  example  (on  the  left) 
shows  comparable  detection  ability.  Despite  the  better 
theoretical  MDL  of  CED  for  isolated  targets,  this  and  most 
other  real  data  sets  show  no  appreciable  difference  in 
detection  ability. 

The peak-picking provides SEED CS with sharper,
more clearly defined contact traces, as can be seen
in the second example from the left. This reduces the
'blacked out' areas resulting from loud contacts. In this
example, the increased spatial resolution does not improve
performance substantially since both grams contain all
traces.

Figure 5: BTR displays for Conventional ED (top row) and Subband Extrema Energy Detection, Clutter Suppress Mode
(SEED CS) (bottom row) for four real acoustic data sets.

In  the  third  example  from  the  left,  which  shows  a 
cluttered  environment,  the  increased  spatial  resolution 
does  provide  a  significant  detection  advantage.  Traces 
that  are  blurred  together  in  the  CED  gram  can  clearly  be 
seen  in  the  SEED  CS  gram. 

The  final  example  again  shows  the  detection 
advantage  of  SED  in  cluttered  noise  environments.  It  also 
shows  comparable  detection  performance  for  the  contact  of 
interest,  the  high  bearing  rate  trace  at  the  bottom. 

6.  SUMMARY 

This  paper  has  compared  the  theory  and  results  of 
both  Conventional  Energy  Detection  and  Subband  Energy 
Detection. The results have shown that SED provides
narrower contact traces and increased bearing resolution
since only the energy of the peaks and valleys is summed.
There  is  also  reduced  smearing  of  acoustic  energy  over 
large  azimuths  and  an  improved  ability  to  detect  nearby 
contacts.  Additionally,  despite  a  lower  theoretical  MDL  for 
isolated  signals,  SED  displays  a  significant  detection 
advantage  in  real  world  (cluttered)  acoustic  environments. 
The  overall  conclusion  is  that  Subband  Energy  Detection 
is  an  important  broadband  processing  method  that 
provides  increased  performance  to  Navy  SONAR  systems. 
1. INTRODUCTION

Detection of a signal embedded in interference is a common
problem encountered in radar, sonar and communication
systems. In cases where it is known that the interference
is low rank (or approximately so) the amount of data required
for adaptation can be reduced by using reduced rank
estimation methods. Three proposed methods for making
the selection of basis vectors are the Cross Spectral Metric
(CSM) [1] method, the Principal Components Inverse
(PCI) [2] method and Multistage Wiener Filter (MWF) [3].
The examination here is for detection of a signal that may or
may not be present within a given set of data. That is, training
and signal detection must be performed using the same
data set. The case of independent training and test data has
been treated in [4, 5].

2.  ADAPTIVE  CSM,  MWF  AND  PCI 

The methods of CSM, PCI and the MWF offer differing
ways of providing an adaptive processor in the signal based
coordinates,

$|S^H X - W_{GSLC}^H (B^H X)|$  (1)

where S is orthogonal to the columns of B. Referring to
Figure 1, given a set of K data vector samples, form a 1 x K
vector $d = [\, d_1 \;\; d_2 \;\; \cdots \;\; d_K \,]$ and an (N - 1) x K matrix
$Z = [\, z_1 \;\; z_2 \;\; \cdots \;\; z_K \,] = U \Sigma V^H$ from the data in
the signal and orthogonal space respectively. Subsequently,
we will use the ~ symbol to denote estimation from data.
For example, $R_z$ is the covariance matrix of the vector
$z_k$ and $\tilde{R}_z$ is an estimate of $R_z$ using a finite number of
vector samples.

Adaptive versions of CSM, PCI and the MWF can be
constructed by using covariance and cross-covariance estimates,

$\tilde{R}_z = \frac{1}{K} Z Z^H, \qquad \tilde{r}_{dz} = \frac{1}{K} Z d^H$  (2)

For CSM and PCI the weight vector is formed as

$W_{GSLC} = U_p \Sigma_p^{-2} U_p^H \tilde{r}_{dz}$  (3)

using the p singular vectors and values selected by the given
method. Adaptive CSM uses the estimated cross spectral
metric, whereas PCI uses only the estimated singular values
to determine which singular vectors are kept. The MWF
uses the estimated quantities in place of the known quantities
in the construction of the multistage decomposition.
It should be noted that the philosophies behind the development
of these methods differ. CSM and MWF were
formulated as a rank reduction for a prescribed rank and
general covariance, whereas PCI was developed with the
assumption that the covariance is from a low rank process
and the rank is estimated from data over the adaptation interval
[6].
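The difference between the PCI and CSM selection rules can be illustrated with a short NumPy sketch. It is written under stated assumptions rather than as the authors' implementation: the function name `reduced_rank_weights` is hypothetical, the estimates follow Eq. (2), the eigenvalues of the estimated covariance play the role of the squared singular values in Eq. (3), and the cross spectral metric is taken in its usual form (squared projection of the cross covariance on each eigenvector, divided by the eigenvalue). The MWF is not covered here.

```python
import numpy as np

def reduced_rank_weights(Z, d, p, method="pci"):
    """Rank-p weight vector built from p selected eigenvectors.

    Z: (N-1, K) orthogonal-space data, d: (K,) signal-channel data.
    method "pci" keeps the p largest eigenvalues (power only);
    method "csm" keeps the p largest cross spectral metric values.
    """
    K = Z.shape[1]
    R = Z @ Z.conj().T / K            # covariance estimate, Eq. (2)
    r = Z @ d.conj() / K              # cross-covariance estimate, Eq. (2)
    lam, U = np.linalg.eigh(R)        # lam stands in for Sigma^2 (assumption)
    proj = U.conj().T @ r             # cross covariance in eigencoordinates
    if method == "pci":
        metric = lam
    elif method == "csm":
        metric = np.abs(proj) ** 2 / lam
    else:
        raise ValueError("method must be 'pci' or 'csm'")
    keep = np.argsort(metric)[::-1][:p]
    return U[:, keep] @ (proj[keep] / lam[keep])

# Hypothetical data: 4 orthogonal beams, 32 snapshots.
rng = np.random.default_rng(0)
Z = rng.standard_normal((4, 32))
d = rng.standard_normal(32)
w_pci = reduced_rank_weights(Z, d, p=2, method="pci")
w_csm = reduced_rank_weights(Z, d, p=2, method="csm")
w_full = reduced_rank_weights(Z, d, p=4, method="pci")
```

When all eigenvectors are kept, both rules return the full-rank solution of the normal equations, which is a convenient sanity check on the construction.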

3.  TRAIN  AND  TEST  ON  SAME  DATA 

Often the scenario is such that the calculation and application
of the weight vector is to be performed on the same
data set. In this case, with no signal present, CSM has
been shown to be the optimal method with respect to mean
square error for the selection of singular vectors, and an upper
bound to the performance of PCI. Consider the assumption
that the interference has a correct rank such that CSM
and PCI should nominally choose the same singular vectors.
Suppose the realization of data has a swap [4] in the singular
vectors chosen by CSM. Then CSM will choose a set
of singular vectors which may not be best for the entire set
of all possible realizations, but that doesn't matter since the
weight vector will only be used on this realization. PCI, on
the other hand, chooses singular vectors which should work
best on all possible realizations and thus does not perform
as well as CSM on this particular realization. That is, the
reasoning that allows PCI to outperform CSM in the independent
data case is the reason that it is poorer in the same
data case with respect to mean square error.

3.1.  Toy  Example 

Let us construct a simple concrete example which can be
used to highlight the characteristics of each method. Assume
a five element array with four sample snapshots and
two jammers. Without loss of generality assume that the
beams of the orthogonal space are chosen to be the eigencoordinates.
Choose an equal power of 1000 for the jammers
and let the signal channel be given as
$d_k = 0.01 z_{1,k} + 0.1 z_{2,k} + 0 z_{3,k} + 0 z_{4,k}$.
Suppose the data with only jamming
(no signal or background noise) is


d  =  1 
The cross covariance of this data is

$\tilde{r}_{dz} = [\, 10000 \;\; 100000 \;\; 0 \;\; 0 \,]^T$  (6)

and the cross correlation is given by

$\tilde{\rho}_{dz} = [\, 0.0995 \;\; 0.9950 \;\; 0 \;\; 0 \,]^T$  (7)
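The values in Eqs. (6) and (7) can be checked numerically. The jammer-only data matrix is not reproduced in this copy of the text, so the matrix `Z` below is a hypothetical reconstruction: jammer samples of amplitude 1000 with sign patterns chosen so that the signal channel comes out as [-90, 90, 110, -110]. The 1/K normalization and the definition of the cross correlation (cross covariance divided by the square root of the product of channel powers) are likewise assumptions that happen to reproduce the quoted numbers.

```python
import numpy as np

# Hypothetical jammer-only eigencoordinate data (4 beams x 4 snapshots).
Z = 1000.0 * np.array([[ 1, -1,  1, -1],
                       [-1,  1,  1, -1],
                       [ 0,  0,  0,  0],
                       [ 0,  0,  0,  0]])

# Signal channel as a fixed combination of the first two beams.
d = 0.01 * Z[0] + 0.1 * Z[1]

K = Z.shape[1]
r_dz = Z @ d / K                               # cross covariance, compare Eq. (6)

power_z = (Z ** 2).mean(axis=1)                # per-beam power
power_d = (d ** 2).mean()                      # signal-channel power
rho_dz = np.zeros_like(r_dz)
active = power_z > 0                           # avoid 0/0 for the empty beams
rho_dz[active] = r_dz[active] / np.sqrt(power_z[active] * power_d)

print(d)
print(r_dz)
print(np.round(rho_dz, 4))                     # compare Eq. (7)
```

Under these assumptions only the first two beams carry any correlation with the signal channel, so a power-based rule and a correlation-based rule keep the same two beams.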

Clearly  all  three  methods  will  choose  identical  subspaces 
and  thus  have  identical  performance.  Let  us  now  include 
the  effects  of  background  noise,  N,  in  the  orthogonal  beams 
such  that  Z  +  N  is  now  given  as 
The cross covariance for this situation has now become

$\tilde{r}_{dz} = [\, 10094 \;\; 100038 \;\; -2.5 \;\; 0.75 \,]^T$  (9)

and the cross correlation has changed to

$\tilde{\rho}_{dz} = [\, 0.1004 \;\; 0.9951 \;\; -0.0315 \;\; 0.0201 \,]^T$  (10)

The addition of the background noise has caused some perturbation
but not sufficient to cause PCI and CSM to disagree
on the singular vector selection. The first MWF basis
vector will also have a negligible change since 10094 >>
2.5. Let us now introduce a signal level of 100 in the first
snapshot such that the signal channel data is given by

$d = [\, 10 \;\; 90 \;\; 110 \;\; -110 \,]$  (11)

The cross covariance has now changed to

$\tilde{r}_{dz} = [\, 35101 \;\; 75088 \;\; -27.5 \;\; 13.25 \,]^T$  (12)

and the cross correlation has changed to

$\tilde{\rho}_{dz} = [\, 0.3898 \;\; 0.8341 \;\; -0.3865 \;\; 0.3970 \,]^T$  (13)

Recall that the PCI choice of subspace is only determined
by the power of each eigenchannel and so is unaffected by
the presence of signal. Although the values of the cross covariance
have changed quite a bit, the first two channels are
still the dominant values and the two dimensional subspace
chosen by the MWF will only slightly be affected. CSM on
the other hand has undergone a change in its choice of subspace
due to perturbations in the correlation as a result of the
signal presence. CSM will now choose channels 2 and 4 as
opposed to 1 and 2. Thus one would now expect decreased
jammer suppression as well as increased signal cancellation.
The fact that the MWF uses the cross covariance, which is a
combination of power and correlation, makes it less susceptible
to perturbation by introduction of signal in a jamming
environment.
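The channel choices described above follow directly from the numbers in Eqs. (12) and (13), as the sketch below shows. Two assumptions are made for illustration: the cross correlation is the cross covariance divided by the square root of the product of channel power and signal-channel power, so the squared ratio of the two vectors is proportional to channel power; and in eigencoordinates the cross spectral metric is proportional to the squared cross correlation. The helper `top_channels` is a hypothetical name.

```python
import numpy as np

r_dz = np.array([35101.0, 75088.0, -27.5, 13.25])       # Eq. (12)
rho_dz = np.array([0.3898, 0.8341, -0.3865, 0.3970])    # Eq. (13)

# (r_i / rho_i)^2 = P_i * P_d, i.e. proportional to the power of channel i.
power_proxy = (r_dz / rho_dz) ** 2

def top_channels(metric, p=2):
    """Return the 1-based indices of the p largest metric values, sorted."""
    return sorted((np.argsort(metric)[::-1][:p] + 1).tolist())

pci_choice = top_channels(power_proxy)      # ranks by channel power
csm_choice = top_channels(rho_dz ** 2)      # ranks by squared cross correlation

# First MWF basis vector: the normalized cross covariance.
h1 = r_dz / np.linalg.norm(r_dz)
mwf_energy_in_jammer_channels = float(np.sum(h1[:2] ** 2))

print(pci_choice, csm_choice, mwf_energy_in_jammer_channels)
```

The power-based ranking keeps channels 1 and 2, the correlation-based ranking moves to channels 2 and 4, and the first MWF basis vector remains almost entirely inside the two jammer channels because the cross covariance weights correlation by power.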

Consider now the cases when the estimation of the rank
is greater than the true rank. The PCI method will choose
basis vectors which contain residual power due to errors in
the estimation of the true interference subspace but will be
unaffected by the presence of signal. The CSM method will
continue to choose singular vectors based on the signal perturbed
values of the cross correlation. Once the interference
is essentially canceled, CSM is choosing the singular vector
for which the noise can best be used to cancel signal
and thus will suffer a loss of performance. For the MWF,
the subspace selection for ranks at or below the interference
rank provides good subspace estimation although some perturbation
due to signal presence does exist. However, the
strength of the interference in the well estimated subspace
will dominate the calculation of the weight for that stage.
Once the interference has been suppressed, the MWF will
then construct the next basis vector from the residual noise
in an attempt to cancel out the signal channel. Unlike CSM
which can only choose between singular vectors based upon
correlation, the MWF utilizes correlation in the construction
of the basis vector. The MWF will therefore suffer significant
signal cancellation once the rank is overestimated.

3.2.  Single  Jammer  Simulation 

Simulations for the same training and test data were run using
a signal plus noise to average noise criterion. The signal
used was a single snapshot, random phase signal at broadside
to the array. Placing the signal only in a single snapshot
is done without loss of generality since it does not statistically
change the SVD or cross covariance and cross correlation
estimates.

A set of 16 signal free snapshots was created and filtered
using the PCI, CSM and MWF methods followed by
matched filtering in time with a squaring of the output. The
signal was then added to this set of snapshots and the filtering
process was repeated on the signal plus noise data. The
ratio $(S+N)_{out} / \mathrm{mean}(N_{out})$ was computed for each method. The
first set of simulations is performed with only one jamming
signal present near a null at 22 degrees and 20 dB JNR as
shown in Figure 2.

The results for PCI and CSM are plotted as scatter plots
for varying levels of input SNR in Figure 7. Each dot represents
an X-Y plot of the results of two methods to an identical
input. The upper left plot is for the case of no signal.
The dots are scattered nearly symmetrically about the diagonal
with the two methods rarely producing the same result.
There does appear a slight bias of the scattering towards
PCI which one would expect since CSM provides the minimum
mean squared error. Since the jammer level is 20 dB,
the PCI choice of basis vector should be nearly constant.
Therefore, the disagreement of the methods is a result of
the varying CSM choice [4]. Again, from the perspective
of mean squared error, the choices are optimum. In the plot
at the top right, a signal at 0 dB is included in the data. An
increase in the shift of the data to the PCI side of the diagonal
is evident. Increasing the signal level to 12 dB in the
lower left, the vast majority of the disagreements between
PCI and CSM result in a higher output $(S+N)_{out} / \mathrm{mean}(N_{out})$ for
the PCI method. When the signal level is raised to 24 dB,
as shown in the lower right plot, essentially all the disagreements
of the two methods favor the PCI method.

Scatter plots for PCI and the MWF are plotted in Figure
12. For the cases of no signal and 0 dB signal (top left
and top right respectively) the two methods produce similar
results spread around the diagonal. Recall that since
PCI and the MWF construct the basis vectors differently, the
agreement along the diagonal would not be exact as in the
case of PCI and CSM. When the signal level is increased to
12 dB and 24 dB (bottom left and bottom right respectively)
a slight advantage for the PCI method is created.

In Figure 13 the mean of $(S+N)_{out} / \mathrm{mean}(N_{out})$ is plotted as a function
of the signal strength for each method. The three methods
are nearly identical at the -6 dB signal level but the
CSM method begins to show a drop in performance relative
to PCI and MWF for signals beginning at 0 dB. The difference
in the methods holds nearly constant as the signal level
is increased past 6 dB. The MWF method is slightly below
the PCI for larger signal levels although it is difficult to see
on the plot.

Let us now examine the performance as a function of
rank in Figures 14 and 15. As described earlier, the performance
of the MWF drops significantly once the rank is overestimated.
The CSM method shows the performance loss
for the correct rank and drops faster than the PCI method
when the rank used is above the true rank.

3.3. Multiple Jammers

Now  consider  a  five  jammer  scenario.  The  pictograph  of  the 
scenario  is  plotted  in  Figure  16. 

The scatter plots for PCI versus CSM are shown in Figure
21. The plots resemble those of the single jammer case
although the performance difference of the two methods has
increased. In Figure 26, the scatter plots are shown for PCI
versus the MWF. The performance difference between the
two methods is now more noticeable for the higher signal
level cases than was apparent with the single jammer. This
results from the fact that the power levels in the last two or
three basis vectors chosen by the MWF are not nearly as
strong as the first two or three.

Figure 1: Generalized Sidelobe Canceler Structure

Figure 2: Single Jammer in Null at 20dB JNR

Figure 3: No Signal  Figure 4: 0dB Signal

Figure 5: 12dB Signal  Figure 6: 24dB Signal

Figure 7: PCI Vs CSM

Figure 10: 12dB Signal  Figure 11: 24dB Signal

Figure 12: PCI Vs MWF

Figure 13: Performance as a Function of Signal Level

Figure 14: Performance as a Function of Rank for 0dB Signal
with Single Jammer

Figure 15: Performance as a Function of Rank for 24dB
Signal with Single Jammer

Figure 16: Five Jammer Scenario

Figure 19: 12dB Signal  Figure 20: 24dB Signal

Figure 21: PCI Vs CSM

The view of the performance of the five jammer scenario
versus signal level is plotted in Figure 27. As before, the
CSM method shows a performance loss even for low signal
levels. As the signal level is increased, the performance
difference also increases as multiple errors in the choice of
basis vectors occur. The MWF method agrees well with the
PCI method up to a signal level of 6 dB at which point the
performance of the MWF begins to lag that of PCI. The performance
degradation of the MWF grows as the signal level
is increased.

Turning to the performance versus rank for a 0 dB signal
in Figure 28, one first notices that the performance of
CSM and the MWF peaks at a rank of 4 as opposed to 5.
Estimation of the 5th basis vector is corrupted by signal and
better performance results from only using 4 basis vectors.
As expected, performance for ranks below the number of
jammers is significantly better for the MWF and CSM than
the PCI method. However, once the rank is overestimated
the performance of the MWF and CSM decreases rapidly for
reasons discussed previously. The signal level is increased
in Figure 29 to 24 dB. Most notable in this plot is the relatively
poor performance of the CSM method for all ranks.


Figure 24: 12dB Signal  Figure 25: 24dB Signal

Figure 26: PCI Vs MWF

Figure 27: Performance as a Function of Signal Level with
5 Jammers

Figure 28: Performance as a Function of Rank for 0dB Signal
with 5 Jammers

Figure 29: Performance as a Function of Rank for 24dB
Signal with 5 Jammers

Figure 30: Receiver Operating Characteristic for 12dB Signal

As a final way of looking at the performance of the
methods, the Receiver Operating Characteristic (ROC) curves
for a 12 dB signal level are plotted in Figure 30. The curve
was generated using $2^{15}$ samples. The performance loss of
both MWF and CSM is evident.

Overall,  the  experiments  validate  the  insights  that  were 
gained  by  examining  our  toy  example. 

4.  CONCLUSIONS 

The Cross Spectral Metric (CSM) and the Multistage Wiener
Filter (MWF) are two recently introduced alternatives to the
Principal Components Inverse (PCI) method of rank reduction
for adaptive detection. By gaining insight into the parameters
that each method utilizes and the estimation characteristics
of those parameters, one can predict how each
method will perform under differing scenarios. PCI selects
basis vectors by use of an SVD and selects a subspace based
upon singular values. The subspace of the SVD is stable under
conditions of strong power. CSM selects basis vectors
by use of an SVD but then selects a subset based upon correlation
with the desired channel. Thus the basis vectors are
chosen with respect to power but then a subset is selected by
use of the cross spectral metric. Since singular vectors are
not necessarily stable, even though a subspace is, CSM has
difficulty since the metric depends upon the singular vectors
rather than the entire subspace. The MWF forms and
chooses basis vectors based upon the cross covariance with
the desired channel which is a combination of power and
correlation. By creating scenarios where the estimates of
these parameters are similar to those obtained by the background
white noise, errors in the selection of the basis vectors
can be made to occur. These errors are responsible for
decreases in the detection performance of the methods.
Abstract - This paper investigates the optimization
of both single and full polarization radar transmission
waveforms to maximize target identification discrimination.
This theory is applied to the discrimination of the
T-72 and M1 battle tanks based upon simulated target
frequency response data. Significant performance improvement
in identification is obtained using an optimized
transmission waveform over that of a standard
chirped pulse.

1  INTRODUCTION 

A number of researchers [1, 2, 3, 4, 5, 6, 7] have considered
the use of sophisticated pulse shaping techniques in
order to maximize the radar energy reflected off of a non-point
target. In particular, Grieve, Guerci, Pillai, Oh, and
Youla [1, 2, 3] have developed a general theory of optimized
pulse shaping that maximizes the target SINR, including
the effects of both generic colored noise and colored
signal-dependent clutter. In addition, Guerci and Pillai
[8, 9] developed the theory of optimized pulse shaping for
single-channel target identification discrimination via the
use of techniques that are similar to those used for detection.
This paper extends this target discrimination analysis by
permitting multiple channels, colored noise, and non-zero
colored clutter.

The present analysis applies the theory of optimized
pulse shaping for target identification discrimination using
two simulated surface targets: the T-72 and M1 main
battle tanks. SAIC-Champaign [10] generated the full-polarization
VHF-band radar signatures for a single elevation
angle of 15° and the full spectrum 0° - 360° of aspect
angles relative to the sensor. These VHF-band data were
generated using the Fast Illinois Solver Code (FISC) that
applies a method-of-moments technique to provide high fidelity
at relatively low radar frequencies. The specific VHF-band
data generated by SAIC-Champaign cover frequencies
between 225-375 MHz at an aspect interval of 2°.

2 OPTIMIZED SINGLE-POLARIZATION TARGET
IDENTIFICATION

The derivation begins with the result that the maximization
of the probability of correct classification between two
target classes $\alpha$ and $\beta$ is equivalent [12, 13] to the maximization
of the square of the Mahalanobis distance

$\eta^2 = (y_\alpha - y_\beta)^H R^{-1} (y_\alpha - y_\beta)$  (1)

between the two target echoes. Here, $y_\alpha = q_\alpha f$ and
$y_\beta = q_\beta f$ are real-valued vectors of length 2N - 1 giving
the temporal samples of the echoes from targets $\alpha$ and $\beta$, respectively.
The real-valued vector $f = [f_0, f_1, \ldots, f_{N-1}]^T$
gives the temporal samples of the transmission pulse. The
real-valued matrices $q_\alpha$ and $q_\beta$ are the convolution impulse
responses for targets $\alpha$ and $\beta$, respectively, having the form
The (2N - 1) x (2N - 1) Hermitian-Toeplitz matrix

$R = \begin{pmatrix} r_0 & r_1 & \cdots & r_{2N-2} \\ r_1^* & r_0 & \cdots & r_{2N-3} \\ \vdots & \vdots & \ddots & \vdots \\ r_{2N-2}^* & r_{2N-3}^* & \cdots & r_0 \end{pmatrix}$  (3)

is the temporal autocorrelation of the noise plus clutter, with
matrix coefficients

$r_\ell = \frac{1}{2\pi} \int \{ G_n(\omega) + G_c(\omega) |F(\omega)|^2 \}\, e^{j\omega\ell}\, d\omega$.  (4)

Thus, $\eta^2$ can be expressed in the form

$\eta^2 = f^H \Omega f$,  (5)

with the matrix $\Omega$ defined by

$\Omega = (q_\alpha - q_\beta)^H R^{-1} (q_\alpha - q_\beta)$.  (6)

For the case of zero clutter $G_c(\omega) = 0$, the minimax theorem
implies that the maximization of $\eta^2$ is obtained when
the transmission pulse vector f is equal to the eigenvector
of $\Omega$ corresponding to the largest eigenvalue.
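For the zero-clutter case the optimization reduces to a single symmetric eigenproblem, which the following NumPy sketch illustrates. Everything specific in it is an assumption made for the example: the impulse responses `h_a` and `h_b` are made-up values, the convolution matrices are built in the standard Toeplitz form (the exact form used in the paper is not reproduced here), the noise is taken as white so that R is the identity, and the rectangular reference pulse merely stands in for a conventional waveform.

```python
import numpy as np

def convolution_matrix(h, n_pulse):
    """Matrix C such that C @ f equals np.convolve(h, f) for len(f) == n_pulse."""
    h = np.asarray(h, dtype=float)
    C = np.zeros((len(h) + n_pulse - 1, n_pulse))
    for j in range(n_pulse):
        C[j:j + len(h), j] = h
    return C

def optimal_pulse(q_a, q_b, R, energy=1.0):
    """Principal eigenvector of the kernel in Eq. (6), scaled to the given energy."""
    dq = q_a - q_b
    omega = dq.T @ np.linalg.solve(R, dq)
    omega = 0.5 * (omega + omega.T)          # remove round-off asymmetry
    lam, V = np.linalg.eigh(omega)           # eigenvalues in ascending order
    f = np.sqrt(energy) * V[:, -1]
    return f, energy * lam[-1], omega

N = 4
h_a = [1.0, 0.5, 0.0, -0.3]                  # hypothetical target responses
h_b = [0.8, -0.2, 0.4, 0.1]
q_a = convolution_matrix(h_a, N)             # (2N-1) x N
q_b = convolution_matrix(h_b, N)
R = np.eye(2 * N - 1)                        # white noise, zero clutter

f_opt, eta2_opt, omega = optimal_pulse(q_a, q_b, R)
f_ref = np.ones(N) / np.sqrt(N)              # unit-energy rectangular pulse
eta2_ref = float(f_ref @ omega @ f_ref)
print(eta2_opt, eta2_ref)
```

Because the squared distance is a Rayleigh quotient in f for fixed energy, no unit-energy pulse can exceed the value attained by the principal eigenvector; the reference pulse gives a feel for how much is gained.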

For the case of non-zero clutter $G_c(\omega) \neq 0$, the autocorrelation
matrix R depends upon the power spectrum
$|F(\omega)|^2$ of the transmission pulse vector f via Eqs. (3) and
(4), so that an iterative procedure similar to that used for
optimized target detection [2] must be applied, as described
below:


6) Let $f_{k+1} \leftrightarrow F_{k+1}(\omega)$ and go back to Step 2 with k
replaced by k + 1, and repeat until $\epsilon_k$ is sufficiently
small. Then the optimized transmission vector is

$f = \lim_{k \to \infty} f_k$.  (9)

Figure 1 gives the improvement in the square of the Mahalanobis
distance between the T-72 and the M1 at
VHF-band resulting from the use of the optimized transmission
pulse over that of a standard chirped pulse. This figure
shows two values of the CNR: 0 and 10. The improvement
in the square of the Mahalanobis distance degrades as the
CNR level is increased, as occurs with the SINR improvement
in the detection problem [2].

For the case of aspect uncertainty, it is necessary to compute
the expected value of the square of the Mahalanobis
distance, i.e.,

$\bar{\eta}^2 = \int d\theta\, \xi(\theta)\, \eta^2(\theta) = \int d\theta\, \xi(\theta)\, f^H \Omega(\theta) f$,  (10)

with the density function $\xi(\theta)$ characterizing the a priori
likelihood of the target aspect $\theta$. The matrix $\Omega(\theta)$ now includes
aspect dependence, i.e.,

$\Omega(\theta) = (q_\alpha(\theta) - q_\beta(\theta))^H R^{-1} (q_\alpha(\theta) - q_\beta(\theta))$.  (11)

Inserting Eq. (11) into Eq. (10) implies that

$\bar{\eta}^2 = f^H \bar{\Omega} f$  (12)

can be expressed in terms of


1) For the initialization k = 0, begin with any real causal
temporal vector $f_0$ of duration $t_0$ and energy $E_0$.

2) Let $f_k \leftrightarrow F_k(\omega)$ and find the corresponding temporal
autocorrelation matrix $R_k$ using Eqs. (3) and (4).

3) Compute the $\Omega_k$ matrix using Eq. (6) in terms of the
autocorrelation matrix $R_k$ and the target impulse response
matrix q.

4) Find the largest eigenvalue $\lambda_1^{(k)}$ and corresponding
normalized eigenvector $v_1^{(k)}$ of the $\Omega_k$ matrix.

5) Define the error at stage k by


$\bar{\Omega} = \int d\theta\, \xi(\theta)\, \Omega(\theta)$.  (13)

Thus, optimization of the transmission waveform to maximize
identification performance under conditions of aspect
uncertainty involves the computation of the weighted average
of $\Omega(\theta)$ matrices with respect to aspect. Furthermore,
the iterative procedure described above for the case of non-zero
clutter is modified only by the replacement of the $\Omega$
matrix by its weighted average $\bar{\Omega}$.
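On a discrete grid of aspect angles the weighted average of the kernel matrices becomes a weighted sum, which can be sketched as follows. The function name `aspect_averaged_kernel`, the two diagonal example kernels and the equal prior weights are hypothetical; the only assumption about the method itself is that the aspect density is sampled at the same angles as the kernels and normalized to sum to one.

```python
import numpy as np

def aspect_averaged_kernel(omegas, weights):
    """Discrete weighted average of aspect-dependent kernel matrices.

    omegas: array of shape (n_aspects, N, N), one kernel per aspect angle.
    weights: prior likelihood of each aspect (normalized internally).
    """
    omegas = np.asarray(omegas, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.tensordot(w, omegas, axes=1)   # sum_i w_i * omegas[i]

# Two hypothetical aspects whose best pulse directions differ.
omega_1 = np.diag([4.0, 1.0])
omega_2 = np.diag([1.0, 4.0])
omega_bar = aspect_averaged_kernel([omega_1, omega_2], weights=[0.5, 0.5])

best_averaged = np.linalg.eigvalsh(omega_bar)[-1]
best_per_aspect = [np.linalg.eigvalsh(m)[-1] for m in (omega_1, omega_2)]
print(omega_bar, best_averaged, best_per_aspect)
```

In this toy case each aspect alone supports a squared distance of 4, but a single pulse designed for the averaged kernel reaches only 2.5, which illustrates the price of not knowing the aspect.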

3  OPTIMIZED  FULL-POLARIZATION  TARGET 
IDENTIFICATION 


(7) 

and  invoke  the  same  update  rule  that  is  applied  in  Pillai 
and  Guerci  [2] 


ffc+i 


ffc  +  frVi 


(fc) 


1  + 


vk)  - ( 7k ) 


(8) 


This section describes the theory of optimal waveform
transmission and reception in order to maximize the Mahalanobis
distance between two target echoes for the case
of a single full-polarization waveform, i.e., one containing
both horizontal and vertical components. Consider the
2N-length real-valued transmission signal vector and corresponding
frequency response vector

$f = \begin{pmatrix} f_h \\ f_v \end{pmatrix} \leftrightarrow F = \begin{pmatrix} F_h \\ F_v \end{pmatrix}$  (14)

with $f_h$, $f_v$, $F_h$ and $F_v$ each containing N temporal samples.
The subscripts h and v denote the horizontal and vertical
channels, respectively. This transmit vector is further
constrained to have finite energy $E_0$. This energy constraint
corresponds to the case in which the sum of the transmission
energies in both the horizontal and vertical channels
is fixed, so that a single power supply supports both transmission
channels. The 2N x 2N target impulse response
matrix and corresponding frequency response matrix have
the form

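$q = \begin{pmatrix} q_{hh} & q_{hv} \\ q_{vh} & q_{vv} \end{pmatrix} \leftrightarrow Q = \begin{pmatrix} Q_{hh} & Q_{hv} \\ Q_{vh} & Q_{vv} \end{pmatrix}$  (15)

The target echo vector has the form

$y = \begin{pmatrix} y_h \\ y_v \end{pmatrix} = q f$  (16)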

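The full-polarization matrix

$R = \begin{pmatrix} r_0 & r_1 & \cdots & r_{N-1} \\ r_1 & r_0 & \cdots & r_{N-2} \\ \vdots & \vdots & \ddots & \vdots \\ r_{N-1} & r_{N-2} & \cdots & r_0 \end{pmatrix}$  (17)

is the temporal autocorrelation of the noise plus clutter, with
the 2 x 2 sub-matrix coefficients

$r_\ell = \frac{1}{2\pi} \int \{ G_n(\omega) + G_F(\omega) \}\, e^{j\omega\ell}\, d\omega$.  (18)

The matrices $G_n(\omega)$ and $G_F(\omega)$ are the full-polarization
spectral densities corresponding to the noise and the clutter,
respectively. The total clutter power spectral density has the
form

$G_F(\omega) = G_{hh}(\omega) |F_h(\omega)|^2 + G_{hv}(\omega) F_h(\omega) F_v^*(\omega)$  (19)

$\qquad + \, G_{vh}(\omega) F_v(\omega) F_h^*(\omega) + G_{vv}(\omega) |F_v(\omega)|^2 \geq 0$  (20)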

The optimization of the transmission vector f in order to
maximize the square of the full-polarization Mahalanobis
distance gives

$\eta^2 = \max_f f^H \Omega f$,  (21)

with the matrix $\Omega$ defined by

$\Omega = \{ q_\alpha - q_\beta \}^H R^{-1} \{ q_\alpha - q_\beta \}$.  (22)

For the case of zero clutter $G_c(\omega) = 0$, the minimax theorem
implies that the maximization of $\eta^2$ is obtained when
the transmission pulse vector f is equal to the eigenvector
of $\Omega$ corresponding to the largest eigenvalue. For the case
of non-zero clutter $G_c(\omega) \neq 0$, the autocorrelation matrix
R depends upon the full-polarization power spectrum of the
transmission pulse vector f via Eq. (20), so that the iterative
procedure described for single-polarization above must be
applied.

Figure 2 gives the full-polarization waveforms optimized
to maximize the Mahalanobis distance between the T-72
and the M1, as a function of the relative aspect angle for the
case of white noise and zero clutter. The optimized waveform
typically focuses the majority of its energy into a narrow
frequency band corresponding to the maximum target
response at that aspect angle. Figure 3 demonstrates that
the optimized full-polarization waveform gives an improvement
of 1-5 dB in the Mahalanobis distance over that obtained
from the transmission of a full-polarization chirped
waveform.

The analysis described above for the case of aspect certainty
can be extended to the case of aspect uncertainty in a
manner similar to that performed for the single-polarization
case. The resulting theory requires that a weighted average with
respect to relative aspect angle be performed on the autocorrelation
kernel matrix $\Omega$. This averaging of the full-polarization
kernel matrices yields a smoothing of the Mahalanobis
curves, as was obtained in the single-polarization
case.

4  CONCLUSION 

This study investigates the optimization of a single transmission
pulse shape and the receiver impulse response in
order to maximize the probability of correct identification
between two target classes. The optimization of the transmission
pulse shaping in order to maximize target identification
performance that was developed by Guerci and Pillai
[9] is extended to include multiple channels, colored noise,
and non-zero colored clutter. These extensions [11] for the
identification problem are developed via a maximization of
the Mahalanobis distance, and thus the probability of correct
classification, between the echoes of two target classes.

This study applies this theory [9] and its extensions of optimized
transmission pulse shaping in order to investigate
the maximization of the probability of correct identification.
Algorithmic implementation for the simulated T-72
and M1 frequency response data at both single and multiple
polarizations of the VHF frequency band reveals significant
improvements in the Mahalanobis distance from using a single
optimized waveform over that of a standard chirped pulse.
[Figure 1 plot: Mahalanobis Distance Improvement for Various CNR]

Figure 1. This plot presents the Mahalanobis distance
improvement at VHF-band with respect to a chirped
transmission waveform resulting from use of a transmission
pulse shape optimized for T-72 versus M1
identification discrimination. The two curves corresponding
to CNR = 1 and CNR = 10 are plotted as a
function of aspect for 0° to 360°.
[Figure 2 plots: optimized waveforms in frequency, T72 - M1, CNR = 0; horizontal axis: aspect angle (deg)]

Figure 2. This figure gives the full-polarization waveforms
optimized to maximize the Mahalanobis distance
between the T-72 and the M1, as a function of
the relative aspect angle for the case of white noise
and zero clutter. The optimized waveform typically
focuses the majority of its energy into a narrow frequency
band corresponding to the maximum target
response at that aspect angle.


[Figure 3 plot: Mahalanobis Distance Comparison, T72 - M1, CNR = 0]

Figure 3. This figure demonstrates that the optimized
full-polarization waveform gives an improvement of 1-5 dB
in the Mahalanobis distance over that obtained
from the transmission of a full-polarization chirped
waveform.
ABSTRACT

We examine the problem of maximum likelihood covariance
estimation using a sensor array in which the relative
positions of individual sensors change over the observation
interval. The problem is cast as one of estimating
a structured covariance matrix sequence. A vector space
structure is imposed on such sequences, and within that
vector space we define a constraint space given by the
intersection of a hyperplane $W_1$ and the space of sequences
of nonnegative definite matrices $W_2$. Knowledge of the
changing array geometry is used to reduce the dimension
of the search space. An extension of the inverse iteration
algorithm of Burg et al. is proposed for finding the maximum
likelihood solution.

1.  INTRODUCTION 

In many array signal processing applications knowledge of
the observation covariance matrix is essential. Examples of
such applications include MVDR beamforming and direction
of arrival estimation using MUSIC. Many algorithms
for estimating the covariance matrix are available. Perhaps
the simplest and at the same time most common is given by

$$\hat{\mathbf{R}} = \frac{1}{M}\sum_{m=1}^{M} \mathbf{x}_m \mathbf{x}_m^H \qquad (1)$$

which is the maximum-likelihood estimate given independent,
identically distributed, zero-mean random vectors $\mathbf{x}_m$ with
covariance $\mathbf{R}$. Other estimators incorporate information about the
array geometry. These are commonly called structured covariance
estimators and were introduced in [1].
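As an illustration of the estimator in (1), here is a minimal pure-Python sketch that averages the outer products of complex snapshots. The function name `sample_covariance` and the two toy snapshots are assumptions made for this example only.

```python
def sample_covariance(snapshots):
    """Average of the outer products x_m x_m^H over M snapshots.

    Each snapshot is a list of N complex numbers.
    """
    M = len(snapshots)
    N = len(snapshots[0])
    R = [[0j] * N for _ in range(N)]
    for x in snapshots:
        for i in range(N):
            for k in range(N):
                R[i][k] += x[i] * x[k].conjugate()
    return [[R[i][k] / M for k in range(N)] for i in range(N)]

# Two toy snapshots from a 2-element array
X = [[1 + 0j, 1j], [1 + 0j, -1j]]
R_hat = sample_covariance(X)
```

For these two snapshots the cross terms cancel, so the estimate is the identity matrix.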

When the array changes shape significantly over an observation
interval, however, the statistics of the data vectors change.
This invalidates the identical distribution assumption
used to obtain (1) and the assumptions of most structured
covariance estimation algorithms. The phenomenon
of time-varying arrays of sensors exists in nearly all array
applications. (No array is truly time-invariant, although they
may be close enough to achieve the desired performance.)
The effect is exaggerated, though, in towed sonar arrays,
which are subject to underwater currents and the maneuvering
of their parent platform. An array of sensors in which
each sensor is mounted on a different platform with its own
propulsion also constitutes a time-varying array.

This research was funded by a grant from MIT Lincoln Laboratory.

Direction-of-arrival and spectrum estimation for time-varying
arrays has been addressed by a number of authors.
Direction-of-arrival estimation was addressed in [2] and [3].
Fast algorithms for doing the same which are based on the
eigenstructure of the matrix are presented in [4]. In [5]
the EM algorithm is used to estimate the power of far-field
sources using a time-varying array.

In this paper we address the problem of maximum
likelihood (ML) covariance estimation for time-varying arrays.
We proceed by defining a mathematical infrastructure
and applying commonly used linear algebra techniques. We
then propose several search algorithms to find the covariance
that maximizes the likelihood under several constraints
imposed by the array motion. What results may be considered
a time-varying structured covariance estimation algorithm.

2.  DEFINITIONS 

Let $N$ be the number of elements in the array and $\mathbf{p}_n(t) \in \mathbb{R}^3$
be the position of the $n$th element at time $t$. Let $M$
denote the number of data vectors sampled by the array at
times $\{t_1, t_2, \ldots, t_M\}$ with sampling frequency $F_s$. The
$m$th data vector we represent by $\mathbf{x}_m$, which is a normally
distributed complex random vector with mean $\mathbf{0}$ and covariance
$\mathbf{R}(t_m) = \mathbf{R}_m$. The time-varying nature of the
array implies that $\mathbf{R}_m$ need not equal $\mathbf{R}_{m+1}$. The problem
is therefore to estimate $\mathbf{R}_m$ for $m = 1, \ldots, M$.

We make two assumptions regarding the available information.
First, the $N \times 1$ steering vector $\mathbf{a}(\Theta, t)$ is known for
all $t_m$ and for all $\Theta \in S^2$. The $n$th element of the steering
vector is given by

$$a_n(\Theta, t) = \exp\left\{ j 2\pi\, \frac{\mathbf{k}^T(\Theta)\, \mathbf{p}_n(t)}{\lambda} \right\} \qquad (2)$$

where $\mathbf{k}(\Theta)$ is the unit vector associated with the direction
$\Theta$. Secondly, the signal originating at any direction is uncorrelated
with signals originating at other directions. Also,
the sampling rate is such that the sampled signals are independent
random variables. The time-varying covariance
matrix is then given by

$$\mathbf{R}_m = \int_{S^2} \sigma^2(\Theta)\, \mathbf{a}(\Theta, t_m)\, \mathbf{a}^H(\Theta, t_m)\, d\Theta \qquad (3)$$

where $\sigma^2(\Theta)\, d\Theta$ is the time-invariant power of the differential
emitter at location $\Theta$.
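To make the model concrete, the sketch below evaluates a discrete-source version of (3), in which the integral over the sphere is replaced by a sum over a few point sources. Several things here are assumptions for illustration: the phase convention $\exp(j 2\pi\, \mathbf{k}^T \mathbf{p}_n / \lambda)$, a five-element uniform linear array rotating about its centre in the x-y plane, half-wavelength element spacing, and all function names.

```python
import cmath
import math

def unit_vector(az_deg, el_deg):
    """Unit vector k(Theta); azimuth is measured from the t = 0 array axis (x)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))

def element_positions(t, n_elem=5, spacing=0.5, omega=2 * math.pi):
    """Element positions p_n(t), in wavelengths, for a ULA rotating in the x-y plane."""
    phi = omega * t
    centre = (n_elem - 1) / 2
    return [((n - centre) * spacing * math.cos(phi),
             (n - centre) * spacing * math.sin(phi),
             0.0) for n in range(n_elem)]

def steering_vector(az_deg, el_deg, t, wavelength=1.0):
    """a_n = exp(j 2 pi k^T p_n(t) / lambda), the phase convention assumed here."""
    k = unit_vector(az_deg, el_deg)
    return [cmath.exp(2j * math.pi * sum(kc * pc for kc, pc in zip(k, p)) / wavelength)
            for p in element_positions(t)]

def covariance(t, sources):
    """Discrete-source form of Eq. (3): sum over sources of power * a a^H."""
    n = len(element_positions(t))
    R = [[0j] * n for _ in range(n)]
    for power, az, el in sources:
        a = steering_vector(az, el, t)
        for i in range(n):
            for k in range(n):
                R[i][k] += power * a[i] * a[k].conjugate()
    return R

sources = [(1000.0, 45.0, 0.0), (100.0, 110.0, 0.0)]  # (linear power, azimuth, elevation)
R_first = covariance(0.0, sources)
R_later = covariance(0.25, sources)   # a quarter turn later the matrix has changed
```

The diagonal of each matrix equals the total source power, while the off-diagonal entries change with the array orientation, which is why a single time-invariant covariance cannot describe the data.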

Since we are interested in a sequence of Hermitian matrices
let us introduce the following notation:

Definition 1  For positive integers $N$ and $M$, let $V_{N,M}$ be a
space such that $\bar{\mathbf{X}} \in V_{N,M}$ implies that $\bar{\mathbf{X}} = [\mathbf{X}_1, \ldots, \mathbf{X}_M]$
where $\mathbf{X}_m \in \mathbb{C}^{N \times N}$ and $\mathbf{X}_m^H = \mathbf{X}_m$.

Observe that we denote elements of this space by capital,
bold-faced letters with an overbar and the $m$th element of
the sequence by the same letter with a subscript. For some
$a \in \mathbb{R}$ and $\bar{\mathbf{X}} \in V_{N,M}$ we define scalar multiplication as

$$a\bar{\mathbf{X}} = [a\mathbf{X}_1, \ldots, a\mathbf{X}_M]. \qquad (4)$$

Similarly, addition is defined element-wise, that is for $\bar{\mathbf{X}}, \bar{\mathbf{Y}} \in V_{N,M}$

$$\bar{\mathbf{X}} + \bar{\mathbf{Y}} = [\mathbf{X}_1 + \mathbf{Y}_1, \ldots, \mathbf{X}_M + \mathbf{Y}_M]. \qquad (5)$$

It is easy to see that under these operations $V_{N,M}$ is a vector
space over $\mathbb{R}$. For notational convenience we also define the
following operations on vectors:

$$\bar{\mathbf{X}}\bar{\mathbf{Y}} = [\mathbf{X}_1\mathbf{Y}_1, \ldots, \mathbf{X}_M\mathbf{Y}_M] \qquad (6)$$

$$\bar{\mathbf{X}}^{-1} = [\mathbf{X}_1^{-1}, \ldots, \mathbf{X}_M^{-1}]. \qquad (7)$$

Notice that under vector addition as defined in (5) and using
(6) as vector multiplication, $V_{N,M}$ forms a non-commutative
ring. The multiplicative identity is then the length-$M$ sequence
of $N \times N$ identity matrices and (7) is the multiplicative
inverse of $\bar{\mathbf{X}}$. With this in mind, it would be appropriate
to refer to (6) and (7) as multiplication and inversion respectively.


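Definition 2  For all $\bar{\mathbf{X}}, \bar{\mathbf{Y}} \in V_{N,M}$ let

$$\langle \bar{\mathbf{X}}, \bar{\mathbf{Y}} \rangle = \sum_{m=1}^{M} \mathrm{tr}(\mathbf{X}_m^H \mathbf{Y}_m).$$

We claim that this is an inner product on $V_{N,M}$. This is a
result of the following facts, which are easily proved for all
$a \in \mathbb{R}$ and $\bar{\mathbf{X}}, \bar{\mathbf{Y}}, \bar{\mathbf{Z}} \in V_{N,M}$:

$$\langle \bar{\mathbf{X}}, \bar{\mathbf{Y}} \rangle \in \mathbb{R}$$

$$\langle \bar{\mathbf{X}} + \bar{\mathbf{Y}}, \bar{\mathbf{Z}} \rangle = \langle \bar{\mathbf{X}}, \bar{\mathbf{Z}} \rangle + \langle \bar{\mathbf{Y}}, \bar{\mathbf{Z}} \rangle$$

$$\langle a\bar{\mathbf{X}}, \bar{\mathbf{Y}} \rangle = a \langle \bar{\mathbf{X}}, \bar{\mathbf{Y}} \rangle$$

$$\langle \bar{\mathbf{X}}, \bar{\mathbf{Y}} \rangle = \langle \bar{\mathbf{Y}}, \bar{\mathbf{X}} \rangle$$

$$\langle \bar{\mathbf{X}}, \bar{\mathbf{X}} \rangle \ge 0 \text{ with equality iff } \bar{\mathbf{X}} = [\mathbf{0}, \ldots, \mathbf{0}]$$

Therefore $(V_{N,M}, \langle \cdot, \cdot \rangle)$ is an inner product space.

The covariance matrix sequence is an element, $\bar{\mathbf{R}}$, of
$V_{N,M}$. With this in mind we can rewrite (3) as

$$\bar{\mathbf{R}} = \int_{S^2} \sigma^2(\Theta)\, \bar{\Phi}(\Theta)\, d\Theta \qquad (8)$$

where

$$\bar{\Phi}(\Theta) = [\mathbf{a}(\Theta, t_1)\mathbf{a}^H(\Theta, t_1), \ldots, \mathbf{a}(\Theta, t_M)\mathbf{a}^H(\Theta, t_M)].$$

The span of $\bar{\Phi}(\Theta)$ over all $\Theta \in S^2$ is a vector subspace of
$V_{N,M}$. We will call this subspace $W_1$. It is clear from (8)
that $\bar{\mathbf{R}} \in W_1$. Being a vector space, $W_1$ is convex and therefore
path-connected. Furthermore, there exists an orthonormal
basis for $W_1$. We will let $W_2 \subset V_{N,M}$ be the set of
all length-$M$ sequences of non-negative definite Hermitian
matrices. Since any covariance matrix is non-negative definite,
$\bar{\mathbf{R}} \in W_2$. It can be shown that $W_2$ is also convex.
Since the desired sequence lies within both sets the
constraint space is their intersection $W = W_1 \cap W_2$. As the
intersection of two convex sets, $W$ is also convex.

We remark that the set of matrix sequences $W$ may not
coincide exactly with the space of matrix sequences given
by the model in (3), although as has been shown the latter
is a subset of $W$. Our constraint space may contain elements
outside of the cone described by (3). The discrepancy between
the two spaces, and the consequences thereof, remain
open questions.

We will now derive the log-likelihood function of the
covariance matrix sequence for the given data set. The pdf
of the data vectors is defined only for the interior of $W_2$:

$$f(\mathbf{x}_1, \ldots, \mathbf{x}_M) = \pi^{-NM} \prod_{m=1}^{M} |\mathbf{R}_m|^{-1} \exp\left( -\sum_{m=1}^{M} \mathbf{x}_m^H \mathbf{R}_m^{-1} \mathbf{x}_m \right). \qquad (9)$$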
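The log-likelihood is then

$$l(\bar{\mathbf{R}}) = -\sum_{m=1}^{M} \ln|\mathbf{R}_m| - \sum_{m=1}^{M} \mathrm{tr}(\mathbf{x}_m^H \mathbf{R}_m^{-1} \mathbf{x}_m)$$

$$\qquad = -\sum_{m=1}^{M} \ln|\mathbf{R}_m| - \sum_{m=1}^{M} \mathrm{tr}(\mathbf{R}_m^{-1} \mathbf{S}_m) \qquad (10)$$

where we shall call

$$\mathbf{S}_m = \mathbf{x}_m \mathbf{x}_m^H \qquad (11)$$

the sample covariance matrix at time $t_m$. Observe that $\mathbf{S}_m = \mathbf{S}_m^H$
and therefore the length-$M$ sequence of all such matrices,
$\bar{\mathbf{S}}$, is an element of $V_{N,M}$. We can use the notation of
Definition 2 to simplify (10):

$$l(\bar{\mathbf{R}}; \bar{\mathbf{S}}) = -\sum_{m=1}^{M} \ln|\mathbf{R}_m| - \langle \bar{\mathbf{R}}^{-1}, \bar{\mathbf{S}} \rangle. \qquad (12)$$

To find the gradient of the log-likelihood function we
will make use of several differentiation theorems found in
[1].

Theorem 1  For $\mathbf{R}, \Phi \in \mathbb{C}^{N \times N}$,

$$\frac{d}{dx} \ln|\mathbf{R} + x\Phi| \,\Big|_{x=0} = \mathrm{tr}(\mathbf{R}^{-1}\Phi).$$

Theorem 2  For $\mathbf{R}, \mathbf{S}, \Phi \in \mathbb{C}^{N \times N}$,

$$\frac{d}{dx} \mathrm{tr}\left((\mathbf{R} + x\Phi)^{-1}\mathbf{S}\right) \Big|_{x=0} = -\mathrm{tr}(\mathbf{R}^{-1}\Phi\mathbf{R}^{-1}\mathbf{S}).$$

The directional derivative of the log-likelihood along the
vector $\bar{\Phi}$ is

$$\frac{d}{dx} l(\bar{\mathbf{R}} + x\bar{\Phi}; \bar{\mathbf{S}}) = -\sum_{m=1}^{M} \frac{d}{dx} \left[ \ln|\mathbf{R}_m + x\Phi_m| + \mathrm{tr}\left((\mathbf{R}_m + x\Phi_m)^{-1}\mathbf{S}_m\right) \right]$$

$$\qquad = -\sum_{m=1}^{M} \left[ \mathrm{tr}(\mathbf{R}_m^{-1}\Phi_m) - \mathrm{tr}(\mathbf{R}_m^{-1}\Phi_m\mathbf{R}_m^{-1}\mathbf{S}_m) \right]$$

$$\qquad = -\sum_{m=1}^{M} \mathrm{tr}\left(\mathbf{R}_m^{-1}\Phi_m - \mathbf{R}_m^{-1}\mathbf{S}_m\mathbf{R}_m^{-1}\Phi_m\right)$$

$$\qquad = -\sum_{m=1}^{M} \mathrm{tr}\left((\mathbf{R}_m^{-1} - \mathbf{R}_m^{-1}\mathbf{S}_m\mathbf{R}_m^{-1})\Phi_m\right)$$

$$\qquad = \langle \bar{\mathbf{R}}^{-1}\bar{\mathbf{S}}\bar{\mathbf{R}}^{-1} - \bar{\mathbf{R}}^{-1}, \bar{\Phi} \rangle. \qquad (13)$$

Therefore the gradient of the log-likelihood function is given
by

$$\nabla l(\bar{\mathbf{R}}; \bar{\mathbf{S}}) = \bar{\mathbf{R}}^{-1}\bar{\mathbf{S}}\bar{\mathbf{R}}^{-1} - \bar{\mathbf{R}}^{-1}. \qquad (14)$$

Note the similarity of this expression to the analogous expression
for the gradient of the log-likelihood in Burg et al.
[1]. Here the matrices have been replaced with matrix sequences.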
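The gradient expression can be sanity-checked numerically. The sketch below is one way to do so, under simplifying assumptions: sequences of length M = 2 whose elements are 2 x 2 real symmetric matrices (a special case of Hermitian), with arbitrary toy values. It compares a central finite difference of the log-likelihood along a direction with the inner product of the gradient and that direction. All helper names are illustrative.

```python
import math

def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inner(X, Y):
    """<X, Y> = sum_m tr(X_m^H Y_m); blocks are real symmetric, so X_m^H = X_m."""
    return sum(sum(mul2(Xm, Ym)[i][i] for i in range(2)) for Xm, Ym in zip(X, Y))

def loglik(R, S):
    """l(R; S) = -sum_m ln|R_m| - <R^{-1}, S>."""
    logdet = sum(math.log(Rm[0][0] * Rm[1][1] - Rm[0][1] * Rm[1][0]) for Rm in R)
    return -logdet - inner([inv2(Rm) for Rm in R], S)

def gradient(R, S):
    """R^{-1} S R^{-1} - R^{-1}, evaluated element by element of the sequence."""
    out = []
    for Rm, Sm in zip(R, S):
        Ri = inv2(Rm)
        G = mul2(mul2(Ri, Sm), Ri)
        out.append([[G[i][j] - Ri[i][j] for j in range(2)] for i in range(2)])
    return out

R = [[[2.0, 0.5], [0.5, 1.0]], [[1.5, -0.2], [-0.2, 3.0]]]
S = [[[1.0, 0.3], [0.3, 0.5]], [[2.0, 0.1], [0.1, 1.0]]]
Phi = [[[0.3, 0.1], [0.1, -0.2]], [[-0.4, 0.2], [0.2, 0.5]]]

def shifted(x):
    """The sequence R + x * Phi."""
    return [[[Rm[i][j] + x * Pm[i][j] for j in range(2)] for i in range(2)]
            for Rm, Pm in zip(R, Phi)]

h = 1e-6
numeric = (loglik(shifted(h), S) - loglik(shifted(-h), S)) / (2 * h)
analytic = inner(gradient(R, S), Phi)
```

The two values agree to within the accuracy of the finite difference, which is consistent with the derivation leading to the gradient.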

3. ESTIMATION ALGORITHMS

One possible estimator is the projection of the sample covariance
matrix sequence, $\bar{\mathbf{S}}$, onto the constraint space $W$.
This is equivalent to selecting the point in $W$ that is the
closest to $\bar{\mathbf{S}}$ by the standard distance metric for inner product
spaces:

$$d(\bar{\mathbf{X}}, \bar{\mathbf{Y}}) = \langle \bar{\mathbf{X}} - \bar{\mathbf{Y}}, \bar{\mathbf{X}} - \bar{\mathbf{Y}} \rangle^{1/2}. \qquad (15)$$

Because of the similarity between this estimator and classic
filtering where a signal is projected onto the subspace
of all signals which satisfy a certain constraint, we will refer
to this estimator as the sample sequence filter. Since
$W$ is the intersection of two convex sets we employ the
method of projection onto convex sets (POCS) in which the
estimate is first projected onto $W_1$ and then onto $W_2$. This
iterative procedure continues until the improvement in likelihood
gained with each iteration is negligible. Since $W_1$ is
a vector subspace the projection of a vector $\bar{\mathbf{X}}$ onto $W_1$ is
given by

$$\bar{\mathbf{X}}' = \sum_{l=1}^{L} \langle \bar{\mathbf{X}}, \bar{\Phi}_l \rangle\, \bar{\Phi}_l \qquad (16)$$

where $\bar{\Phi}_l$ are the members of an orthonormal basis of $W_1$
and $L$ is the dimension.

Projection onto $W_2$ for the given inner product is found
in [6]. First, the eigendecomposition is determined:

$$\mathbf{X}_m = \Gamma_m \Lambda_m \Gamma_m^{-1}. \qquad (17)$$

Then the projection onto the set of non-negative definite matrices
is given by setting the negative eigenvalues to 0:

$$\mathbf{X}_m' = \Gamma_m \max(\Lambda_m, 0)\, \Gamma_m^{-1}. \qquad (18)$$

The projection of the sequence $\bar{\mathbf{X}}$ onto $W_2$ is the element-wise
projection of each $\mathbf{X}_m$ as described by this equation.
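The eigenvalue-clipping step in (18) can be illustrated for the smallest non-trivial case. The sketch below assumes a single 2 x 2 Hermitian matrix and uses the closed-form 2 x 2 eigenvalues and spectral projectors in place of a general eigendecomposition routine; the function name `project_psd_2x2` is hypothetical.

```python
import math

def project_psd_2x2(X):
    """Project a 2x2 Hermitian matrix onto the non-negative definite matrices
    by setting negative eigenvalues to zero (closed form for the 2x2 case)."""
    a, b, d = X[0][0].real, X[0][1], X[1][1].real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    lam1, lam2 = mean + radius, mean - radius      # the two eigenvalues
    if radius == 0.0:                              # X is a multiple of the identity
        c = max(mean, 0.0)
        return [[c + 0j, 0j], [0j, c + 0j]]
    eye = [[1.0, 0.0], [0.0, 1.0]]
    out = [[0j, 0j], [0j, 0j]]
    # Spectral projector for eigenvalue lam is (X - other * I) / (lam - other)
    for lam, other in ((lam1, lam2), (lam2, lam1)):
        weight = max(lam, 0.0) / (lam - other)
        for i in range(2):
            for j in range(2):
                out[i][j] += weight * (X[i][j] - other * eye[i][j])
    return out

X = [[1.0, 2.0], [2.0, 1.0]]        # eigenvalues 3 and -1
X_proj = project_psd_2x2(X)
```

Applying this function to every element of a sequence gives the projection onto $W_2$; matrices that are already non-negative definite are returned unchanged.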

The sample sequence filter, by its definition, finds the
sequence which is in the constraint space and the closest to
the sample sequence by the distance metric given in (15).
Experience has shown, however, that the best estimate is
rarely the closest to the sample sequence. We therefore propose
searching the constraint space for the maximum likelihood
estimate using the filtered sample sequence as a starting
point.

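Each of the search algorithms which we will consider
proceeds by calculating a search direction, $\bar{\mathbf{D}} \in W_1$, along
which the likelihood function must be maximized. That is,
in each iteration we determine a $\bar{\mathbf{D}}$ and then find $\lambda_0$ such
that

$$\lambda_0 = \arg\max_{\lambda}\, l(\hat{\bar{\mathbf{R}}}^{(i)} + \lambda\bar{\mathbf{D}}).$$

The updated estimate is

$$\hat{\bar{\mathbf{R}}}^{(i+1)} = \hat{\bar{\mathbf{R}}}^{(i)} + \lambda_0\bar{\mathbf{D}}.$$

This iterative process should be allowed to continue until
the gradient is sufficiently close to being orthogonal to $W_1$,
that is until

$$\frac{\sum_{l=1}^{L} \langle \nabla l(\hat{\bar{\mathbf{R}}}; \bar{\mathbf{S}}), \bar{\Phi}_l \rangle^2}{\langle \nabla l(\hat{\bar{\mathbf{R}}}; \bar{\mathbf{S}}), \nabla l(\hat{\bar{\mathbf{R}}}; \bar{\mathbf{S}}) \rangle} < \epsilon. \qquad (19)$$

Observe that since $l$ is defined only on the interior of $W_2$ and
$\bar{\mathbf{D}} \in W_1$, the estimate will be within the constraint space at
each iteration.

Perhaps the most obvious approach to calculating $\bar{\mathbf{D}}$ is
to use the gradient in (14). $\bar{\mathbf{D}}$ can be the projection of the
gradient onto $W_1$. Alternatively, a conjugate gradient direction
can be calculated by incorporating memory of previous
search directions. We suggest, however, a modification of
the inverse iteration algorithm proposed by Burg et al. in
[1]. Burg's algorithm was designed for estimation of a single
matrix rather than a sequence of matrices but is easily
generalized for sequences. For some estimate $\hat{\bar{\mathbf{R}}}^{(i)}$ select a
search direction, $\bar{\mathbf{D}}$, such that $\nabla l(\hat{\bar{\mathbf{R}}}^{(i)}; \bar{\mathbf{S}} - \bar{\mathbf{D}})$ is orthogonal
to $W_1$. Clearly, if $\bar{\mathbf{D}} = \mathbf{0}$ then $\hat{\bar{\mathbf{R}}}^{(i)}$ is the maximum likelihood
estimate since the likelihood gradient is orthogonal to
the constraint space at that point.

Before the modified inverse iteration algorithm may be
seriously considered, though, one must ask whether a stable
point of the algorithm maximizes the likelihood function
within the constraint space. That is, does each iteration
of the algorithm lead to an improvement in likelihood for a
nonzero search direction? The answer is yes, as shown by
the following theorem:

Theorem 3  Suppose there exists $\bar{\mathbf{D}} \neq \mathbf{0} \in W_1$ such that
$\nabla l(\bar{\mathbf{R}}; \bar{\mathbf{S}} - \bar{\mathbf{D}})$ is orthogonal to $W_1$. Then there exists $\lambda \in \mathbb{R}$,
$\lambda \neq 0$, such that $l(\bar{\mathbf{R}} + \lambda\bar{\mathbf{D}}; \bar{\mathbf{S}}) > l(\bar{\mathbf{R}}; \bar{\mathbf{S}})$.

Proof: By way of contradiction, suppose that

$$\arg\max_{\lambda}\, l(\bar{\mathbf{R}} + \lambda\bar{\mathbf{D}}; \bar{\mathbf{S}}) = 0.$$

This implies that $\nabla l(\bar{\mathbf{R}}; \bar{\mathbf{S}})$ is orthogonal to $\bar{\mathbf{D}}$. That is,

$$\langle \nabla l(\bar{\mathbf{R}}; \bar{\mathbf{S}}), \bar{\mathbf{D}} \rangle = 0.$$

Therefore,

$$\langle \nabla l(\bar{\mathbf{R}}; \bar{\mathbf{S}} - \bar{\mathbf{D}}), \bar{\mathbf{D}} \rangle = \langle \bar{\mathbf{R}}^{-1}(\bar{\mathbf{S}} - \bar{\mathbf{D}})\bar{\mathbf{R}}^{-1} - \bar{\mathbf{R}}^{-1}, \bar{\mathbf{D}} \rangle$$

$$\qquad = \langle \nabla l(\bar{\mathbf{R}}; \bar{\mathbf{S}}) - \bar{\mathbf{R}}^{-1}\bar{\mathbf{D}}\bar{\mathbf{R}}^{-1}, \bar{\mathbf{D}} \rangle$$

$$\qquad = -\langle \bar{\mathbf{R}}^{-1}\bar{\mathbf{D}}\bar{\mathbf{R}}^{-1}, \bar{\mathbf{D}} \rangle.$$

Since $\bar{\mathbf{D}} \in W_1$ we know that $\langle \nabla l(\bar{\mathbf{R}}; \bar{\mathbf{S}} - \bar{\mathbf{D}}), \bar{\mathbf{D}} \rangle = 0$.
Therefore

$$\langle \bar{\mathbf{R}}^{-1}\bar{\mathbf{D}}\bar{\mathbf{R}}^{-1}, \bar{\mathbf{D}} \rangle = 0.$$

It can be shown that this implies that $\mathbf{R}_m^{-1}\mathbf{D}_m = \mathbf{0}$ for all
$m$. Therefore $\mathbf{D}_m = \mathbf{0}$ for all $m$, which is a contradiction.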

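We now concentrate on finding the direction which satisfies
the necessary condition on the gradient. This is equivalent
to finding $\bar{\mathbf{D}}$ which satisfies

$$\langle \bar{\mathbf{R}}^{-1}(\bar{\mathbf{S}} - \bar{\mathbf{D}})\bar{\mathbf{R}}^{-1} - \bar{\mathbf{R}}^{-1}, \bar{\Phi}_i \rangle = 0 \qquad (20)$$

for all $i$. We note that this is a system of equations which are
linear in $\bar{\mathbf{D}}$ and that therefore a closed-form solution exists.
Since $\bar{\mathbf{D}} \in W_1$ there exist real $\alpha_j$ such that

$$\bar{\mathbf{D}} = \sum_j \alpha_j \bar{\Phi}_j. \qquad (21)$$

Substituting (21) into (20) and rearranging we get

$$\langle \bar{\mathbf{R}}^{-1}(\bar{\mathbf{S}} - \bar{\mathbf{D}})\bar{\mathbf{R}}^{-1} - \bar{\mathbf{R}}^{-1}, \bar{\Phi}_i \rangle = \Big\langle \bar{\mathbf{R}}^{-1}\Big(\bar{\mathbf{S}} - \sum_j \alpha_j \bar{\Phi}_j\Big)\bar{\mathbf{R}}^{-1} - \bar{\mathbf{R}}^{-1}, \bar{\Phi}_i \Big\rangle$$

$$\qquad = \langle \nabla l(\bar{\mathbf{R}}; \bar{\mathbf{S}}), \bar{\Phi}_i \rangle - \sum_j \alpha_j \langle \bar{\mathbf{R}}^{-1}\bar{\Phi}_j\bar{\mathbf{R}}^{-1}, \bar{\Phi}_i \rangle. \qquad (22)$$

Therefore we need only find $\boldsymbol{\alpha} \in \mathbb{R}^L$ such that

$$\mathbf{A}\boldsymbol{\alpha} = \mathbf{B} \qquad (23)$$

where

$$A_{ij} = \langle \bar{\mathbf{R}}^{-1}\bar{\Phi}_j\bar{\mathbf{R}}^{-1}, \bar{\Phi}_i \rangle \qquad (24)$$

$$B_i = \langle \nabla l(\bar{\mathbf{R}}; \bar{\mathbf{S}}), \bar{\Phi}_i \rangle. \qquad (25)$$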
4.  COMPUTER  SIMULATION 

We have simulated a uniform linear array (ULA) consisting
of $N = 5$ isotropic sensors which is rotating with rotational
velocity $\omega$ about the center element. The axis of
rotation is orthogonal to the axis of the ULA. There are 3
source fields impinging upon the array which originate at
(azimuth, elevation) = (45°, 0°), (85°, 20°), and (110°, 0°).
Here azimuth is the angle made with the axis of the array
at $t = 0$ and within the plane of rotation. Elevation
is the angle made with the plane of rotation. For example,
(90°, 10°) would describe a direction orthogonal to the initial
array axis and 10° above the plane of rotation. Each
of the sources is assumed to be narrowband with wavelength
$\lambda$ and the separation between elements in the ULA
is … The power of each source at the array is 30 dB, 15 dB,
and 20 dB respectively. Receiver noise is 0 dB. The rotational
velocity of the array is $\omega = 2\pi$ rad/sec, the sampling
rate is $F_s = 32\ \mathrm{s}^{-1}$, and the number of samples collected
is $M = 16$. Therefore, the array gathers 16 data vectors
while completing a half rotation. Since the statistics of the
data vector change dramatically over this observation interval
one expects the covariance estimator in (1) to perform
badly. It is unclear even what steering vector to use with
this covariance estimate in, for example, an MVDR or MUSIC
estimator. That makes this scenario a good candidate
for covariance matrix sequence estimation.

Figure 1: Algorithm convergence rate comparison.

Each of the algorithms considered begins by calculating
the sample covariance matrix sequence and filtering it by
the method of POCS using the projections in (16) and (18).
The $\bar{\Phi}_l$ are obtained by Gram-Schmidt orthonormalization
of the set of vectors $\bar{\Phi}(\Theta)$ where $\Theta$ is a discretization of the
2-sphere. For this example, the dimension of $W_1$ is $L = 76$.
Note that the dimension of $V_{N,M}$ is $N^2 M = 400$ and
we have managed to eliminate 324 parameters of the matrix
sequence by applying information about the motion of the
array.
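The figures quoted for this scenario follow from simple arithmetic, sketched below using only the values stated in the text (the variable names are illustrative).

```python
import math

N, M, L = 5, 16, 76               # sensors, snapshots, dimension of W1 from the text
Fs, omega = 32.0, 2 * math.pi     # samples per second, rad/sec

rotation = omega * M / Fs         # angle swept over M sampling intervals
dim_V = N * N * M                 # N x N Hermitian matrices have N^2 real parameters
eliminated = dim_V - L            # parameters removed by the geometry constraint
```

The swept angle is pi radians, i.e. the half rotation mentioned above, and the counts reproduce the 400 and 324 quoted in the text.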

The ML search routine follows the filtering. We applied
a gradient and conjugate gradient search algorithm in addition
to the modified inverse iteration algorithm. The likelihood
of the estimate at each iteration is plotted in Figure 1
for each of the algorithms.

Observe that the gradient and conjugate gradient algorithms
converge to points with the same likelihood. The
conjugate gradient reaches this point in fewer iterations, which
is to be expected. However, the convergence point of the inverse
iteration algorithm exceeds the likelihood of the estimate
obtained from either gradient algorithm after only a
few iterations. Inspection of the likelihood gradient at what
appears to be the convergence point of the gradient algorithms
reveals that it is not orthogonal to $W_1$ and that while
successive iterations yield only slight improvement in likelihood,
they have failed to reach a local maximum. One possibility
is that they have stumbled upon a "ridge" in the likelihood
function. It is clear that, in this example at least, the
inverse iteration algorithm reaches a solution in fewer iterations
than even the conjugate gradient algorithm. It should
be noted, however, that finding the solution to (23) requires
more computation than calculating the likelihood gradient
and projecting it onto $W_1$.

Figure 2: MVDR spectrum estimated from the first matrix
in the sequence. The dashed line is that obtained from the
sample sequence filtering procedure. The solid line was calculated
from the ML sequence.

To demonstrate the validity of the ML estimate, the MVDR
spectrum corresponding to the first matrix in the sequence
has been calculated and plotted in Figure 2. The spectrum
is calculated using

$$\hat{\sigma}^2(\Theta) = \frac{1}{\mathbf{a}^H(\Theta, t_1)\, \hat{\mathbf{R}}_1^{-1}\, \mathbf{a}(\Theta, t_1)}. \qquad (26)$$

$\hat{\mathbf{R}}_1$ is the first matrix in the sequence obtained by the inverse
iteration algorithm since the other two algorithms failed
to produce a maximum likelihood estimate. The position of
each of the sources is easily ascertained from the plot as is
a feeling for their intensities. Also plotted is the spectrum
obtained from just the sample sequence filter. While peaks
which correspond to two of the sources can be seen, the
third is lost and the background noise is quite high. This
demonstrates the necessity of the ML search algorithms.
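As a small illustration of how an MVDR spectrum is evaluated from a covariance matrix, the sketch below uses the standard form 1 / (a^H R^{-1} a) on a toy problem. The setup is an assumption for illustration only: a fixed two-element array with half-wavelength spacing, one source at 60 degrees from the array axis with power 10, and unit-power white noise. It is not the five-element rotating array of the simulation.

```python
import cmath
import math

def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mvdr_power(R, a):
    """1 / (a^H R^{-1} a) for a 2x2 covariance R and steering vector a."""
    Ri = inv2(R)
    quad = sum(a[i].conjugate() * Ri[i][k] * a[k] for i in range(2) for k in range(2))
    return 1.0 / quad.real

def steering(theta_deg, spacing=0.5):
    """Two-element steering vector; spacing is in wavelengths."""
    phase = 2 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [1 + 0j, cmath.exp(1j * phase)]

a0 = steering(60.0)
R = [[10 * a0[i] * a0[k].conjugate() + (1.0 if i == k else 0.0) for k in range(2)]
     for i in range(2)]
spectrum = {th: mvdr_power(R, steering(th)) for th in range(0, 181, 5)}
peak = max(spectrum, key=spectrum.get)
```

The spectrum peaks at the source direction, where its value is the source power plus the noise power divided by the number of elements.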

5.  CONCLUSIONS 

We have developed an algorithm for estimating the sequence
of matrices which are the covariances of the data vectors of
a time-varying sensor array. Since each matrix in the sequence
is structured to satisfy (3) this may be classified as
a structured covariance estimation algorithm in which the
sequence of matrices itself is structured. This method will
have good performance for time-varying arrays in which the
array motion is periodic, as with the rotating ULA, since
the constraint space basis need not be continuously recalculated,
which relieves the processor of some of the computational
burden. It has been demonstrated that the modified
inverse iteration algorithm can converge faster and more
reliably than a simple gradient search algorithm, although
with increased computational complexity per iteration.
ABSTRACT

This paper presents the development and performance evaluation
of a methodology for distinguishing between mainlobe and sidelobe
detections that arise in adaptive radar systems operating in
adverse environments. Various adaptive detection test statistics
such as the adaptive matched filter (AMF), the generalized likelihood
ratio test (GLRT) and adaptive coherence estimate (ACE),
and combinations of these, have been previously analyzed with respect
to their sidelobe rejection capabilities. In contrast to these
methods, which are based on detecting a single target with known
direction and Doppler, the present method uses model order determination
techniques applied to the AMF or GLRT data observed
over the range of unknown angle and Doppler parameters. The determination
of model order, i.e., the number of signals present in
the data, is made by using least-squares model fit error residuals
and applying the Akaike Information Criterion. Comprehensive
computer simulation results are presented which demonstrate substantial
improvement in sidelobe rejection performance and detection
of multiple sources compared to previous methods.

1.  INTRODUCTION 

A variety of constant false-alarm rate (CFAR) adaptive detection
statistics have been developed and analyzed for radar target detection
in adverse environments [1]-[8]. Adaptive beamforming,
adaptive filtering and, generally, joint space-time adaptive processing
(STAP) methods are being increasingly considered for airborne
radar detection of low-Doppler targets immersed in ground clutter
and subject to directional noise jamming. An important issue that
needs to be considered is the sidelobe performance of these adaptive
detection algorithms. "False" sidelobe detections can arise
due to undernulled interference, intrinsically high sidelobes generated
by the adaptive beamforming space-time algorithms used
with limited data snapshots, and other reasons. This can result in
an unacceptably high false alarm rate. Previous works have focused
on determining the sidelobe rejection performance of the
adaptive matched filter (AMF) test [3], [6], the generalized likelihood
ratio test (GLRT) of Kelly [1], the adaptive coherence estimator
(ACE) test and a cascade of AMF/ACE tests [4] or AMF/GLRT
tests [8]. It is to be noted that all of these previous methods are
based on applying adaptive detection criteria developed for detecting
a single target signal with known direction and Doppler in correlated
noise. In contrast to this, the present work uses multiple
maximum-likelihood model order fits to the AMF or GLRT data
observed over the range of the unknown angle and Doppler parameters.
The resulting fit error residuals are used in the Akaike Information
Criterion (AIC) to deduce the correct model order and
thereby reject "false" sidelobe detections, and improve detection
and resolution of multiple sources.

2.  MAXIMUM-LIKELIHOOD  MODEL  ORDER 
DETERMINATION  USING  AMF  OR  GLRT 

We begin by considering two well-known adaptive detection methods,
AMF and GLRT, as a starting point for our new method described
below and also for performance comparison purposes. We
consider an $N$-element array and seek to determine the presence of
one or more signals in an observation vector (or snapshot) $\mathbf{x}$ called
the test cell. The methodology developed here applies to the general
STAP problem where the data vector $\mathbf{x}$ can be a concatenated
space-time vector of array element data and coherent pulse samples;
however, the computer simulation results presented in section
5 use only simulated spatial array data so our development here
will be mainly presented in that context.

Consider then that the AMF [3] and GLRT [1] metrics have
been computed as a function of angle (azimuth) and result in the
following test:

$$J_{\mathrm{AMF}}(\theta) = \frac{|\mathbf{d}(\theta)^H \hat{\mathbf{R}}^{-1} \mathbf{x}|^2}{\mathbf{d}(\theta)^H \hat{\mathbf{R}}^{-1} \mathbf{d}(\theta)} = |\mathbf{w}(\theta)^H \mathbf{x}|^2 \;\underset{H_0}{\overset{H_1}{\gtrless}}\; K\alpha_{\mathrm{AMF}} \qquad (1)$$

where $\mathbf{d}(\theta)$ is the signal steering vector for angle $\theta$, i.e., the array
response vector, $\hat{\mathbf{R}}$ is the sample covariance matrix of the interference
plus noise (whose true covariance matrix is $\mathbf{R}$), based on an
auxiliary set of $K$ data vectors $\mathbf{x}_i$, $i = 1, \ldots, K$, which share the
same interference-plus-noise-only covariance matrix with the test
data $\mathbf{x}$, and $K\alpha_{\mathrm{AMF}}$ is the threshold, which can be determined numerically
for a given false alarm probability $P_{FA}$. The hypothesis $H_1$ denotes signal plus
noise and the null hypothesis $H_0$ denotes noise only. An alternate
form is shown on the right side of (1) where an array weight vector
$\mathbf{w}(\theta)$ can be defined as $\mathbf{w}(\theta) = \hat{\mathbf{R}}^{-1}\mathbf{d}(\theta) \,/\, \sqrt{\mathbf{d}(\theta)^H \hat{\mathbf{R}}^{-1} \mathbf{d}(\theta)}$.
Equation (1) represents the adapted array output for the test vector
$\mathbf{x}$ normalized by the output interference plus noise power.
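The equivalence of the two forms of the AMF statistic can be checked on a toy example. The sketch below assumes a two-element array so that the matrix inverse can be written in closed form; the steering vector, covariance estimate, test snapshot and function names are arbitrary illustrative choices, not values from the paper.

```python
import math

def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def amf_statistic(d, R_hat, x):
    """|d^H R^{-1} x|^2 / (d^H R^{-1} d) for a 2-element array."""
    Ri = inv2(R_hat)
    num = sum(d[i].conjugate() * Ri[i][k] * x[k] for i in range(2) for k in range(2))
    den = sum(d[i].conjugate() * Ri[i][k] * d[k] for i in range(2) for k in range(2))
    return abs(num) ** 2 / den.real

def amf_weights(d, R_hat):
    """w = R^{-1} d / sqrt(d^H R^{-1} d)."""
    Ri = inv2(R_hat)
    Rid = [sum(Ri[i][k] * d[k] for k in range(2)) for i in range(2)]
    den = sum(d[i].conjugate() * Rid[i] for i in range(2)).real
    return [v / math.sqrt(den) for v in Rid]

d = [1 + 0j, 1 + 0j]                          # broadside steering vector
R_hat = [[2 + 0j, 0.5j], [-0.5j, 1 + 0j]]     # Hermitian, positive definite
x = [2 + 1j, 1 - 1j]                          # test snapshot
w = amf_weights(d, R_hat)
j_direct = amf_statistic(d, R_hat, x)
j_weights = abs(sum(wi.conjugate() * xi for wi, xi in zip(w, x))) ** 2
```

Both routes give the same value because the covariance estimate is Hermitian, so applying the normalized weight vector to the snapshot reproduces the ratio form.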
Fig. 1. $J_{\mathrm{AMF}}(\theta)$ function for a 20 dB signal at broadside, $N = 10$,
$K = 100$, $P_{FA} = 10^{-6}$.


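In order to control the sidelobe response of the adaptive array,
the weight vector $\mathbf{w}(\theta)$ is often computed as
$\mathbf{w}(\theta) = \hat{\mathbf{R}}^{-1}\mathbf{d}_{sh}(\theta) \,/\, \sqrt{\mathbf{d}_{sh}(\theta)^H \hat{\mathbf{R}}^{-1} \mathbf{d}_{sh}(\theta)}$,
where $\mathbf{d}_{sh}(\theta) = \mathbf{d}(\theta) \odot \mathbf{w}_T$,
$\mathbf{w}_T$ is an appropriate taper or shading function, and $\odot$
denotes the element-by-element Schur product.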

The  GLRT  test  is 


would  include  the  peaks  of  the  Jamf(0)  function.  We  have 


yi 

— 

w(0i)"D,a 

+ 

’  „(0i)  ' 

.  VKP  . 

.  w(e«-P)"D,a  . 

.  v(eKp) . 

or, compactly,

$Y = H a + v,$ (7)

where $H = W^H D_s$, $W = [w(\theta_1), \ldots, w(\theta_{K_p})]$, and
$v = [v(\theta_1), \ldots, v(\theta_{K_p})]^T$. The covariance matrix of $v$ is

$R_v = E[v v^H] = W^H R W.$ (8)

Since the order of the square matrix $R_v$ is $K_p$ and the transformation
in (8) necessarily yields rank($R_v$) $\le N$, it follows that we
must have $K_p \le N$ for $R_v$ to be nonsingular. Hence we require
that $m < K_p \le N$. Denote the sample covariance matrix of
$v$ as $\hat{R}_v$. Under the assumption of Gaussian statistics for the
interference plus noise vector $n$, the maximum-likelihood estimates
of the amplitude vector $a$ and the angles $\Theta_s = [\theta_{s1}, \ldots, \theta_{sm}]$
are obtained by minimizing the nonlinear weighted least-squares
criterion

$$J_{ML}(a, \Theta_s) = [Y - Ha]^H \hat{R}_v^{-1} [Y - Ha]
= \| \hat{R}_v^{-1/2} [Y - Ha] \|^2, \qquad (9)$$


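$$J_{GLRT}(\theta) = \frac{|d(\theta)^H \hat{R}^{-1} x|^2}{d(\theta)^H \hat{R}^{-1} d(\theta) \left[ 1 + \frac{1}{K} x^H \hat{R}^{-1} x \right]} \; \underset{H_0}{\overset{H_1}{\gtrless}} \; K\gamma, \qquad (3)$$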


where $K\gamma$ is the threshold, which can be determined for a given
false alarm probability $P_{FA}$. The $J_{AMF}(\theta)$ or $J_{GLRT}(\theta)$ are
evaluated at some discrete set of points in the angle $\theta$ which covers
the range of expected target angles. Note that as far as variation with
$\theta$ is concerned, $J_{GLRT}(\theta)$ is merely proportional to $J_{AMF}(\theta)$
since the denominator in (3) does not depend explicitly on the search
variable $\theta$. An example of the $J_{AMF}(\theta)$ function for a single target
is shown in Figure 1. Note that if all peaks above the threshold, which has
been set for a probability of false alarm $P_{FA}$ of $10^{-6}$, were to be
considered detections, the figure shows that there would be six
detections, of which five would be false alarms (solid line).
Even if a Chebyshev taper with −50 dB sidelobe level is used, there
are still two false detections (dashed line). The shading is only
partly effective in the presence of interference, in this case one
jammer at −30 degrees.

Now assume that the test data vector contains $m$ target signals,
$m = 0, 1, \ldots, M$, where the maximum number $M$ may be known
from system considerations. Then,

$x = D_s a + n,$ (4)

where $D_s = [d(\theta_{s1}), \ldots, d(\theta_{sm})]$ is an $N \times m$ matrix of target
steering vectors and $a$ is an $m \times 1$ vector of complex amplitudes
of the $m$ signals. The complex value of the $J_{AMF}(\theta)$ function
represents the application of the weight vector $w(\theta)$ to the vector $x$,
resulting in the expression

$y(\theta) = w(\theta)^H x = w(\theta)^H D_s a + v(\theta),$ (5)

where $v(\theta) = w(\theta)^H n$. We assume that $y(\theta)$ has been evaluated
at $K_p$ distinct points $\theta_1, \ldots, \theta_{K_p}$, where $K_p > m$. These points


where $\hat{R}_v^{-1/2}$ is the square root of the Hermitian positive-definite
matrix $\hat{R}_v^{-1}$ and $\|\cdot\|$ denotes the Euclidean norm of a vector. Let
$Z = \hat{R}_v^{-1/2} Y$ be the “whitened” data vector and $H_w = \hat{R}_v^{-1/2} H$.
Then,

$J_{ML}(a, \Theta_s) = \| Z - H_w(\Theta_s) a \|^2.$ (10)

For a given $\Theta_s$, as is well known, $J_{ML}$ is minimized with respect
to $a$ when

$\hat{a} = [H_w^H(\Theta_s) H_w(\Theta_s)]^{-1} H_w^H(\Theta_s) Z.$ (11)

Substitution of $\hat{a}$ as given by (11) into (10) yields the weighted
least-squares residual $J_{ML}$ as

$J_{ML}(\hat{a}, \Theta_s) = \| (I - P(\Theta_s)) Z \|^2,$ (12)

where $P(\Theta_s) = H_w(\Theta_s) [H_w^H(\Theta_s) H_w(\Theta_s)]^{-1} H_w^H(\Theta_s)$ is the
orthogonal projection operator and $I$ is the identity matrix.
Equation (12) can be further minimized with respect to $\Theta_s$, yielding the
maximum-likelihood estimate $\hat{\Theta}_s$. However, it is noted that this
is a nonlinear optimization problem which may be computationally
expensive to solve for $m > 2$. For most of the sidelobe
detection problems considered here, involving comparable strength
targets that are likely to be separated from each other by more
than a beamwidth, the locations of the peaks of the $J_{AMF}(\theta)$
function (which can be readily computed) provide a reasonably
accurate estimate of $\Theta_s$ and are used to evaluate (12). However, for
some problems, e.g., the detection and resolution of a weak source
in the presence of a strong source, the location of the global peak of
$J_{AMF}(\theta)$ may be taken as the angle estimate $\theta_1$ corresponding to
one source while $\theta_2$ is varied so as to minimize (12), keeping $\theta_1$
fixed.
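The amplitude estimate (11) and the residual (12) can be sketched as follows for an already whitened model. This is an illustrative sketch only: the function names and the normal-equation solver are assumptions, and the inputs `hw` (the $K_p \times m$ matrix $H_w$ as a list of rows) and `z` (the whitened vector $Z$) are taken as given.

```python
def solve(a, b):
    """Solve a x = b for a square complex matrix (Gaussian elimination, partial pivoting)."""
    n = len(a)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def ls_residual(hw, z):
    """Return (a_hat, J) with a_hat from (11) and J = ||(I - P) z||^2 from (12)."""
    kp, m = len(hw), len(hw[0])
    cols = [[hw[r][c] for r in range(kp)] for c in range(m)]
    gram = [[inner(cols[i], cols[j]) for j in range(m)] for i in range(m)]  # H_w^H H_w
    rhs = [inner(cols[i], z) for i in range(m)]                             # H_w^H z
    a_hat = solve(gram, rhs)
    fit = [sum(hw[r][c] * a_hat[c] for c in range(m)) for r in range(kp)]   # P z
    residual = sum(abs(z[r] - fit[r]) ** 2 for r in range(kp))
    return a_hat, residual
```

Evaluating `ls_residual` once per candidate angle set, with the angles taken from the peaks of the AMF function, corresponds to the approximate procedure described above; a one-dimensional search over the second angle would call it repeatedly with the first angle held fixed.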
It is noted that the preceding development has been given in
“beam-space” since this reduces computations and is most
appropriate for resolving sidelobe detections obtained using the
$J_{AMF}(\theta)$ function (a normalized beam-space function). It can be
seen that the element-space solution can be obtained either directly
or from the preceding development by choosing $K_p = N$ and $W$
to be the $N \times N$ identity matrix. A simulation example using the
element-space solution is given in Section 5.

The number of target signals is determined by applying the
procedure described above for model orders $m = 1, 2, \ldots, M$
and choosing that $m$ for which the Akaike Information Criterion
[9], [10] given below is a minimum:

$$\mathrm{AIC}(m) = -\log(\text{likelihood function} \mid \hat{a}, \hat{\Theta}_s, m)
+ (\text{number of independently adjusted parameters in model})
= J_{ML}(\hat{a}, \hat{\Theta}_s) + 3m, \qquad (13)$$

where $J_{ML}(\hat{a}, \hat{\Theta}_s)$ is given by (12) and the approximate estimate
$\hat{\Theta}_s$ discussed above is used. The method derived here is referred
to as the Multi-Target AMF (MT-AMF) method.
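The model-order rule in (13) amounts to picking the order with the smallest penalized residual. A minimal sketch, assuming the residuals $J_{ML}$ for $m = 1, \ldots, M$ have already been computed (the function name and the input layout are hypothetical); the penalty $3m$ is read here as one complex amplitude (two real parameters) plus one angle per target:

```python
def select_model_order(residuals):
    """Pick the model order m minimizing AIC(m) = J_ML(m) + 3 m.

    residuals[m - 1] holds the least-squares residual J_ML for order m = 1..M.
    Returns the chosen order and the AIC scores for every tested order.
    """
    scores = {m: j_ml + 3 * m for m, j_ml in enumerate(residuals, start=1)}
    best = min(scores, key=scores.get)
    return best, scores


# Example: adding a second target removes most of the residual, a third barely helps.
order, scores = select_model_order([40.0, 5.0, 4.5])
```

In this example the second order wins because the drop in residual from one to two targets exceeds the penalty of 3, while the drop from two to three does not.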


4.  MULTI-TARGET  GLRT 

Although this paper has emphasized the multi-target AMF in the
development and performance evaluation, it is noted here that the
authors have derived [13] a generalization of Kelly’s GLRT adaptive
detection statistic [1] to multiple targets. It is shown in [13]
that the multi-target version of Kelly’s GLRT for $M$ targets located
at angles $\Theta_s = [\theta_{s1}, \ldots, \theta_{sm}]$ is given by

$$J_{GLRT}(\Theta_s) = \frac{\| P(\Theta_s) y \|^2}{1 + \frac{1}{K} \| y \|^2}, \qquad (16)$$

where $P(\Theta_s) = D_w(\Theta_s) [D_w^H(\Theta_s) D_w(\Theta_s)]^{-1} D_w^H(\Theta_s)$ and
$D_w(\Theta_s) = \hat{R}^{-1/2} D_s(\Theta_s)$. Here $\hat{R}^{-1/2}$ is the square root of the
Hermitian positive-definite matrix $\hat{R}^{-1}$, $y = \hat{R}^{-1/2} x$ is the
“whitened” data vector, and $P(\Theta_s)$ is the orthogonal projection
operator that projects any vector onto the subspace spanned by the
columns of $D_w(\Theta_s)$ (i.e., the subspace spanned by the whitened
steering vectors of the $M$ targets).

5.  PERFORMANCE  EVALUATION 


3.  DIAGONAL  LOADING 

Diagonal loading is a simple and commonly used procedure for
sidelobe reduction. It is often used when the number of snapshots
$K$ is small, e.g., less than twice the number of elements. The
diagonal loading operation simply adds a diagonal matrix to the
sample covariance matrix to overweight its diagonal elements, i.e.,


The $P_{FA}$ of the GLRT test is given by [3]

PFAGLRT  =  (1  +Xa)L  >  <17) 

where $L = K + 1 - N$, $a = \gamma / (1 + \gamma)$, and $\gamma$ is the threshold
term of (3). The threshold for the AMF is determined by evaluating
the following integral using numerical integration and bisection
iterations as in [3]:


$\hat{R}_{DL} = \hat{R} + \sigma I,$ (14)

where $\sigma$ is the diagonal loading factor. In the case of uncorrelated
interference and noise, diagonal loading modifies the sample
covariance matrix at the cost of noise enhancement. In the case of
correlated interference, a large amount of diagonal loading also
degrades the adaptive interference cancellation. However, it has been
shown that a reasonable amount of loading $\sigma$ can dramatically improve
the performance for small $K$.
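As a small illustration of (14), the sketch below forms a sample covariance matrix from snapshots and adds the loading term. The function names and the single-snapshot example are assumptions chosen only to show that loading makes a rank-deficient estimate invertible.

```python
def sample_covariance(snapshots):
    """R_hat = (1/K) * sum_i x_i x_i^H for a list of K complex snapshot vectors."""
    k = len(snapshots)
    n = len(snapshots[0])
    return [[sum(x[i] * x[j].conjugate() for x in snapshots) / k
             for j in range(n)] for i in range(n)]


def diagonal_load(r_hat, sigma):
    """Return R_hat + sigma * I without modifying the input matrix."""
    n = len(r_hat)
    return [[r_hat[i][j] + (sigma if i == j else 0.0) for j in range(n)]
            for i in range(n)]


# One snapshot for a 2-element array gives a rank-1 (singular) estimate.
r_hat = sample_covariance([[1 + 0j, 1 + 0j]])
r_dl = diagonal_load(r_hat, 0.5)
det_before = r_hat[0][0] * r_hat[1][1] - r_hat[0][1] * r_hat[1][0]
det_after = r_dl[0][0] * r_dl[1][1] - r_dl[0][1] * r_dl[1][0]
```

Here `det_before` is zero, so the unloaded estimate cannot be inverted, while `det_after` is strictly positive once the loading term is added.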

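When diagonal loading is applied, the AMF function is given
by

$$J_{AMF}(\theta) = \frac{|d(\theta)^H \hat{R}_{DL}^{-1} x|^2}{d(\theta)^H \hat{R}_{DL}^{-1} \hat{R} \hat{R}_{DL}^{-1} d(\theta)}. \qquad (15)$$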


Additional tapered weights can be applied by replacing $d(\theta)$ by
$d_{sh}(\theta)$. In the matched filter (MF) case, i.e., $K = \infty$, the
detection statistic does not change when diagonal loading is applied.
However, in the case of limited snapshots, the determination of the
threshold for a given $P_{FA}$ seems to be analytically intractable [11].
Thus, a Monte Carlo computation is required. For an uncorrelated
interference and noise case, the authors in [12] have shown
improvement of signal detection for small $K$ using diagonal loading.
In this paper, we show similar improvement of $P_D$ in the case of
correlated interference. In addition, we apply the MT-AMF to the
diagonally loaded AMF function to further reduce false sidelobe
detections.


(18) 


(19) 


is the central Beta density function, and $\rho$ is the loss factor, which
accounts for the SNR loss due to the finite number of snapshots in the
sample covariance matrix. The analytic form of the probability
of detection for a single source is also given in [3], which we
exclude for brevity. Our Monte Carlo simulation results have been
confirmed to match these analytical curves.

We consider a linear equally spaced array of 10 elements with
half-wavelength spacing (nominal beamwidth = 12 degrees) for
most of the simulations provided in this section. A noise jammer
signal of strength 40 dB relative to thermal noise is placed at −30
degrees and the $P_{FA}$ is set to $10^{-6}$. The scanning angles are from
−50 to 50 degrees in azimuth. A single source of varying SNR is
placed at broadside and the performance of the algorithms in single
source detection and false sidelobe rejection is compared. The
AMF detection relies only on the peaks above the given threshold,
but the MT-AMF test takes the peaks (for all simulation examples
except the last one) and tests for model order $m = 1$ and 2. If
$m = 1$ is decided, the overall peak is retained as an indicator of a
single signal and the other peaks are rejected. The probability of
detecting the mainlobe signal is plotted in Figure 2, regardless of
the number of peaks or model decisions, after 5000 Monte Carlo
runs. We observe no loss in detection for the MT-AMF method.
Then, the probability of false sidelobe detections is plotted for the
two algorithms in Figure 3. The AMF gives rise to high false
sidelobe detections at high SNR, but the MT-AMF greatly reduces the
false sidelobe detections and its probability also saturates as SNR
increases. The false sidelobe detections of the proposed method go
down rapidly for increasing $K$, and the lower bound is for $K = \infty$,
which is the multi-target matched filter. For the tapered weight vector
$w(\theta)$, we also compare the sidelobe rejection performance, as
depicted in Figure 4. Note that the use of a taper with the
conventional AMF method only reduces sidelobe detections slightly
at the cost of a slight decrease in mainlobe detection probability
(not shown). However, the use of model order determination using
AIC with tapered AMF data shows almost the same dramatic
improvement in reducing false sidelobe detections as before, with the
same mainlobe detection probability as the conventional tapered
AMF.

The same single source scenario, except with a $P_{FA}$ of $10^{-3}$ and
$K = 20$, is further studied using diagonal loading and tapered
weights. Monte Carlo simulations are performed to determine the
thresholds which yield the equivalent $P_{FA}$ for various levels of
diagonal loading. Note that in this case the $P_{FA}$ accounts for not only
the noise but also the jammer that is not effectively cancelled due
to the use of diagonal loading. The probability of detecting the
mainlobe signal is plotted in Figure 5. Note the improved $P_D$
performance using various levels of diagonal loading. The MT-AMF
with diagonal loading and tapering also yields identical $P_D$
performance. The probability of false sidelobe detections is plotted
for the two methods in Figure 6. As the diagonal loading level
increases, the probability of false sidelobe detections using AMF
decreases most of the time (except for the high SNR region). On the
other hand, the MT-AMF shows significant improvement in reducing
false sidelobe detections compared with the AMF at the same
diagonal loading level.

Then, two sources of equal strength are placed at broadside and
45 degrees. The probability of detecting both sources within a
±10 degree angle constraint is plotted for the AMF and MT-AMF
algorithms, as depicted in Figure 7. We observe the same
detection performance for the conventional method and the proposed
algorithm. The two-source detections using the GLRT are plotted
in Figure 8. However, for $K = 20$, the GLRT yields extremely
poor performance in detecting both sources due to the normalization
factor in the denominator of (3). The derivation of the GLRT
is under the assumption of a single source; therefore, despite its
advantage in single source detection, as depicted in Figure 9, and
sidelobe rejection for lower $K$, as depicted in Figure 10, it is not
an appropriate model for two sources.

Another two-source detection scenario is analyzed for a linear
equally spaced array of 32 elements. One mainlobe source
is placed at broadside with an array SNR of 25 dB, and a second
sidelobe source is placed at 45 degrees with varying SNR levels.
A noise jammer signal of strength 40 dB relative to thermal noise
is again placed at −30 degrees and the $P_{FA}$ is set to $10^{-6}$. The
MT-AMF and MT element space methods are applied to source
detection with a varying angle search for the weaker source while
fixing the angle of the stronger source at the global peak of the
AMF function. The number of data points $K_p$ used in the MT-AMF
is nominally $N/2$, and the points are taken from the peaks of the AMF
function without the threshold constraint. We count detections of
both sources when the model order decision yields $m = 2$ and
the angle estimates are within ±3.2 degrees (nominal beamwidth)


Fig.  2.  Probability  of  detecting  single  mainlobe  target  signal  using 
AMF  and  MT-AMF.  Note  equal  performances  of  the  two  methods. 


Fig.  3.  Probability  of  false  sidelobe  detections  using  AMF  and 
MT-AMF.  Note  the  substantial  improvement  of  the  MT-AMF 
method  in  false  sidelobe  rejections  at  high  SNR. 


of the true angles of arrival. As depicted in Figure 11, the MT-AMF
method improves the detection of both sources significantly
over the AMF method, where the detections are based on the top
two peaks above the threshold. When the strength of the sidelobe
source dominates, strong interactions of its sidelobe response
would perturb the weaker mainlobe source and reduce the
probability of detection. Nevertheless, we can resolve this problem
by using the MT-AMF method. The MT element space method is
applied to the element data $x$ and further improves the two-source
detections; nonetheless, the beam-space MT-AMF method has
significant computational advantages when the number of elements is
large. The ML element space method, which searches for the
absolute minimum residual on the two-dimensional angle parameter
space (high computational complexity), is also shown as the upper
bound of the two-source detections.
Fig. 4. Probability of false sidelobe detections using tapered AMF
weights (−50 dB Chebyshev). Note significant improvement even
when taper is used.

Fig. 7. Probability of detecting both sources within ±10 degrees
using AMF and MT-AMF. Note equal performances of the two
methods.
Fig. 5. Probability of detecting single mainlobe target signal using
the diagonally loaded and tapered (−50 dB Chebyshev) AMF and
MT-AMF. Note equal performances of the two methods.

Fig. 6. Probability of false sidelobe detections using diagonally
loaded and tapered AMF weights. Note significant improvement
even when diagonal loading and taper are used.

Fig. 8. Probability of detecting both sources within ±10 degrees
using GLRT. Note the degraded performance, especially for
smaller $K$.

Fig. 9. Probability of detecting single mainlobe target signal using
GLRT. Note superior performance over AMF and MT-AMF for
small $K$.
6. CONCLUSIONS


Fig. 10. Probability of false sidelobe detections using GLRT. Note
good sidelobe rejection capability for smaller $K$ at the expense of
reduced detections of two sources (Fig. 8).




Fig.  11.  Probability  of  detecting  both  sources  within  ±3.2  degrees 
using  AMF,  MT-AMF,  and  MT  element  space  methods.  Note  su¬ 
perior  performances  of  the  two  MT  methods. 


In this paper, we have shown substantial improvement in false sidelobe
rejection and in two-source detection using the proposed model
order determination method. The algorithm is computationally
efficient and can be easily implemented in existing adaptive radar
systems.
ABSTRACT

Space-time least squares FIR filters have shown excellent
clutter rejection performance at extremely low computational
load so that ground moving target indication (GMTI) can
be carried out in real-time. Staggering the pulse repetition
interval (PRI) is an appropriate way of avoiding Doppler
ambiguities and blind velocities. Fully adaptive space-time
processors can cope well with staggered echo data. FIR filtering
techniques are based on constant PRI and, therefore, will
suffer some degradation if the radar pulses are staggered. In
this contribution the concept of re-adaptation of the FIR filter
coefficients at each PRI is put forward. It is shown by
simulations that the total loss caused by staggering the PRI is
on the order of a few dB. However, applying a constant FIR filter
to staggered data results in dramatic losses in
signal-to-clutter+noise ratio.

1.  INTRODUCTION 
1.1.  Preliminaries 

The motion of an air- or spaceborne radar causes clutter
returns from the ground to be Doppler shifted. The Doppler
shift of an arrival from a single scatterer is proportional to the
cosine of the angle of arrival of the backscattered echo. The
total of all clutter arrivals results in a Doppler broadband
signal where the Doppler bandwidth is determined by the
platform velocity. The clutter bandwidth degrades the detectability
of slow moving targets. Space-time adaptive processing
(STAP) has been shown to compensate for the platform
motion effect so that basically no losses in slow moving target
detection will occur.

The basis of STAP techniques is the likelihood ratio (LR)
test, which states that the space-time echoes received by a
coherent array antenna have to be multiplied with the inverse
of the space-time clutter covariance matrix, followed by
coherent signal integration using a beamformer and Doppler
filters. If the number of array elements N and the number of
processed echoes M are large, the matrix inverse may not be
available for various reasons:

• Adaptation means estimation of the clutter covariance
matrix. The number of operations required may be
prohibitive if N and M are large.

• There may be a lack of representative clutter data to
estimate the covariance matrix.

• The computation of the matrix inverse may be
impossible because of extensive computational load.

• The computation of the matrix inverse may be
impossible due to limited numerical accuracy.

• Implementation of the LR processor requires that all
elements of the array antenna are cascaded with
digitized channels. This is currently unrealistic for reasons
of cost.

Figure 1: Overlapping subarray processor with space-time
FIR filter
Legend: elements of spatial submatrices; K × K spatial submatrices;
shifted diagonal space-time submatrices (K=4, L=3)

Figure 2: Matrix scheme for space-time FIR filtering, K=4,
M=5


1.2. Subspace Techniques


Many publications deal with suboptimum approximations
(frequently referred to as subspace techniques) of the
space-time LR processor (e.g., Ward [11], Klemm [2],
Goldstein & Reed [4]).

There are rank-reducing techniques which conduct clutter
suppression in the clutter subspace of the space-time
covariance matrix while maintaining the order of
the filter matrix. The eigencanceler type of architecture
(Haimovich & Bar-Ness [3]) belongs to this class. Saving
of operations is achieved during the adaptation and filter
calculation phase, but not during filtering of the echo data
at range sample speed.

Order reduction STAP techniques lead to reduced-size
architectures which promise a reduction of the computational
load for adaptation, filter calculation, and filtering as well.
This class of processors is particularly suited to real-time
processing.

A  large  class  of  order  reduction  architectures  are  based  on 
certain  linear  transforms.  There  are  space-time  transforms, 
spatial  transforms  and  temporal  transforms  ([2,  Chapter  5-7]). 
For  large  M  post-Doppler  techniques  which  operate  in  the 
Doppler  domain  may  lead  to  very  efficient  receiver  schemes. 


Figure  3:  Fully  adaptive  processing,  constant  PRI 


Figure  4:  FIR  filter,  based  on  data  from  first  5  echoes,  con¬ 
stant  PRI 

1.3. The Space-Time FIR Filter

1.3.1. The principle

Space-time FIR filters exploit the stationarity of echo
sequences. Klemm & Ender [8] analysed a least squares
space-time filter for GMTI processing in real-time. Related
approaches have been described by Baranoski [10],
Roman et al. [9] and Swindlehurst & Parker [7]. In the
concept of Goldstein & Reed [5] several 1-delay subfilters
are cascaded.

Figure 5: FIR filter, sliding calculation of coefficients,
constant PRI

Figure 6: Fully adaptive processing, randomly staggered PRI

The use of space-time least squares FIR filters for airborne
applications, introduced in [8] and described in detail in
[2, Chapter 7], has proven to be a highly efficient way of adaptive
ground clutter suppression for moving radar. The filter is closely
related to the Maximum Entropy Method (Burg [1]). A block
diagram of a FIR filter based GMTI processor is shown in
Figure 1. Notice that the spatial dimension has been reduced by
subdividing the array antenna into K subarrays. If K ≪ N
the number of operations for clutter suppression is strongly
reduced. Further reduction can be obtained by choosing a
space-time FIR filter with L < M delays. The FIR filter is
calculated as follows:

• Choose a segment of L echoes with L < M.

• Calculate the associated space-time clutter covariance
matrix. It will be one of the submatrices along the
diagonal of the matrix scheme shown in Figure 2. These
submatrices are indexed by m = 1, 2, 3.

• Calculate the inverse of the submatrix.

• Select the first K × KL block row of the inverse to
become $\mathbf{K}$.

• Multiplying an N × 1 beamformer vector b with $\mathbf{K}$
results in a 1 × KL vector of space-time filter coefficients
$\mathbf{K}$b.

It  has  been  demonstrated  that  the  temporal  filter  length 
can  be  chosen  independently  of  the  coherent  processing  inter¬ 
val  M  (CPI).  This  is  a  desirable  property,  particularly  when 
the  filter  is  used  in  a  multi-mode  radar  where  the  CPI  varies 
with  different  operational  modes.  Even  with  very  low  filter  di¬ 
mensions  (e.g.,  K  =  L  =  3,  total  number  of  coefficients:  9) 
excellent  approximation  of  the  performance  of  the  optimum 
processor  can  be  achieved. 
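The filter calculation steps listed above can be sketched in a few lines. This is a hedged illustration, not the authors' code: the helper names are hypothetical, the beamformer is taken at subarray level (length K) and applied as $b^H$ times the first block row so that the dimensions agree, and the 4 × 4 matrix in the usage note is an arbitrary positive-definite stand-in for an estimated K = 2, L = 2 clutter covariance matrix.

```python
def solve(a, b):
    """Solve a x = b for a square complex matrix (Gaussian elimination, partial pivoting)."""
    n = len(a)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def first_block_row_of_inverse(q, k):
    """Rows 0..k-1 of q^-1 (a k x KL block row), without forming the full inverse."""
    n = len(q)
    q_t = [list(col) for col in zip(*q)]
    rows = []
    for i in range(k):
        e = [1.0 if r == i else 0.0 for r in range(n)]
        rows.append(solve(q_t, e))      # column i of (q^T)^-1 equals row i of q^-1
    return rows


def fir_coefficients(q, b):
    """1 x KL space-time filter coefficients: b^H times the first block row of q^-1."""
    block_row = first_block_row_of_inverse(q, len(b))
    return [sum(b[i].conjugate() * block_row[i][c] for i in range(len(b)))
            for c in range(len(q))]
```

For example, `fir_coefficients(q, [1, 1j])` with a 4 × 4 matrix `q` returns four coefficients. A useful sanity check is that the coefficient vector times `q` equals $b^H$ followed by zeros, which is the prediction-error property of the first block row of the inverse.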


1.3.2.  Mathematical  description 


The  first  column  of  the  inverse  of  a  Toeplitz  matrix  is  called 
a  prediction  error  filter.  It  has  the  property  of  minimizing 
the  output  power  of  a  stationary  process  determined  by  the 
Toeplitz  covariance  matrix.  The  inverse  of  the  space-time  co- 
variance  matrix  Q  has  the  same  form  as  Q: 
Let us now consider a small segment of L echoes out of M,
where L denotes the temporal filter length. We assume
that L ≪ M. Recall that the submatrices $\mathbf{K}_{jk}$ are spatial, that
is, they are related either to the antenna array (N × N) or to
the subspace given by the antenna channels (K × K).

Then the order reduced space-time covariance matrix
becomes

where L is the temporal dimension of the space-time FIR
filter. Assuming for instance K = 3 and L = 3, this matrix has
the dimensions 9 × 9. The space-time prediction error filter is
the first block column of $\mathbf{K}$

The FIR filter operation can be formulated as follows


(5) 


is a shift operator and 0 is a K × K zero matrix. The spatial
dimension of the FIR filter can be removed by pre-beamforming

$\mathbf{h} = \mathbf{K}\mathbf{b}$ (6)

so that the filtering operation can be written as


with o being a K-dimensional zero vector and $x_m$ the signal
vector at the array outputs at time m. Notice that z is
temporal only, with the dimension M − L + 1, while the
dimension of the space-time vector y was (M − L + 1) × K.
The processor is completed with a Doppler filter bank with
Doppler filters of length M − L + 1 whose output signals are
obtained as

d  =  Fz  (9) 

where  the  matrix  F  describes  the  Doppler  filter  bank,  for  in¬ 
stance,  the  DFT.  The  elements  of  d  are  fed  into  a  detection 
device. 
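For the constant-PRI case, where F in (9) is the DFT, the Doppler filter bank can be sketched as follows. The function name is hypothetical and the direct summation is chosen for clarity rather than speed.

```python
import cmath
import math


def dft_bank(z):
    """Apply a DFT Doppler filter bank d = F z to the clutter-filtered sequence z."""
    n = len(z)
    return [sum(z[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


# A target whose normalized Doppler falls exactly on bin 2 of an 8-point bank.
z = [cmath.exp(2j * math.pi * 2 * m / 8) for m in range(8)]
d = dft_bank(z)
peak_bin = max(range(len(d)), key=lambda k: abs(d[k]))
```

In this example all of the signal energy collects in bin 2, and it is the magnitudes of the elements of `d` that would be passed on to the detection device.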

The improvement factor in SCNR becomes

$$\mathrm{IF}(\omega_d) = \frac{s^*(\omega_d) H H^* s(\omega_d) \cdot s^*(\omega_d) H H^* s(\omega_d) \cdot \mathrm{tr}(Q)}{s^*(\omega_d) H H^* Q H H^* s(\omega_d) \cdot s^*(\omega_d) s(\omega_d)} \qquad (10)$$

where we made the usual assumption that the processor is
perfectly matched to the target signal vector and $\omega_d$ denotes
the Doppler frequency. In the discussion below we use
IF($\omega_d$) to judge the efficiency of processing and the effect
of parameters. Instead of IF($\omega_d$) we show IF(F), where
F = $\omega_d$/(2π PRF) is the normalized Doppler frequency.

2.  STAGGERED  PRI  RADAR 
2.1. General Aspects

The PRF is commonly chosen constant, which, however, has
a couple of drawbacks:

• The target Doppler cannot be estimated unambiguously.

• The clutter filter produces ambiguous notches at the
blind velocities.

• The PRF can be estimated by hostile ESM (electronic
support measures) and countered with spot jamming.

Alternatively one may stagger either the PRF or the PRI. PRF
staggering has the disadvantage that several pulse bursts have
to be transmitted, which means a waste of radar energy. This
problem is circumvented by PRI staggering (it has the minor
drawback that the FFT algorithm cannot be used as Doppler
filter bank).

The  effect  of  PRI  staggering  for  use  with  STAP  has  been 
discussed  in  [6].  It  was  demonstrated  that  the  optimum  (LR) 
processor  can  cope  well  with  staggered  PRI,  provided  that  the 
Doppler  filters  are  matched  to  the  staggered  pulse  sequence. 
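Matching the Doppler filters to the staggered pulse sequence can be illustrated as below. The sketch assumes an ideal unit-amplitude target echo and a hypothetical stagger pattern; the names `pulse_times`, `doppler_weights` and `filter_output` are introduced here for illustration only.

```python
import cmath
import math


def pulse_times(pris):
    """Cumulative transmit times (seconds) for a list of pulse repetition intervals."""
    times = [0.0]
    for pri in pris:
        times.append(times[-1] + pri)
    return times


def doppler_weights(times, f_doppler):
    """Doppler filter weights exp(j 2 pi f t_m) sampled at the given pulse times."""
    return [cmath.exp(2j * math.pi * f_doppler * t) for t in times]


def filter_output(weights, z):
    """Coherent integration w^H z."""
    return sum(w.conjugate() * s for w, s in zip(weights, z))


# Hypothetical stagger pattern around a nominal 1 ms PRI, and a 300 Hz target.
staggered = pulse_times([1.0e-3, 1.2e-3, 0.9e-3, 1.1e-3, 0.8e-3])
uniform = pulse_times([1.0e-3] * 5)
echo = doppler_weights(staggered, 300.0)

matched = abs(filter_output(doppler_weights(staggered, 300.0), echo))
mismatched = abs(filter_output(doppler_weights(uniform, 300.0), echo))
```

The filter built on the actual pulse times integrates all six echoes coherently, whereas the filter that assumes a constant PRI loses part of the gain. Because the weights depend on the irregular pulse times, they cannot be generated by an FFT.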

2.2.  FIR  Filtering  with  Staggered  PRI 

Now the question arises how an extremely efficient clutter
filter technique such as the adaptive space-time FIR filter can
operate with staggered PRI. Staggering the transmit pulses
means that the received echo sequence is no longer
stationary. Recall that the efficiency of the adaptive FIR filter
relies on stationary data sequences.

Stationarity of the echo sequence means that the space-time
submatrices (m = 1, 2, 3) in Figure 2 are equal. If the
pulse sequence is staggered these matrices are different. A
straightforward approach to cope with this non-stationarity
is to re-adapt the filter at each PRI. That means, at each PRI
the space-time submatrix is estimated anew. Then we obtain
a space-time FIR filter with time-varying coefficients.

The adaptation of the FIR filter with each PRI causes
additional expense in terms of computations. This is, however,
tolerable, because the FIR filter is based on a small number of
coefficients. Therefore, the associated time-dependent
subcovariance matrix is very small and needs only very few clutter
echo samples for estimation. These can easily be taken at
each PRI from the received range samples.
Figure 7: FIR filter, constant coefficients, randomly staggered
PRI


Figure 8: FIR filter, sliding calculation of coefficients,
randomly staggered PRI


similar  to  the  one  in  Figure  3,  however,  with  slightly  broad¬ 
ened  and  deeper  clutter  notches.  In  Figure  5  we  applied  the 
principle  of  re-adaptation  on  echo  data  with  constant  PRF. 
Then  the  filter  coefficients  are  calculated  from  the  different 
submatrices  (m  =  1 ...  3).  As  can  be  noticed  Figure  5  is 
identical  to  Figure  4.  The  reason  is  obvious:  For  constant 
PRF  the  echo  sequence  is  stationary,  and  the  submatrices  are 
identical. 

Let us now introduce a pseudorandom stagger code. Figure 6
shows again the behaviour of the optimum LR processor.
As can be seen, there is only one clutter notch left. The
ambiguities do not show up anymore.

Applying a space-time FIR filter with constant coefficients
leads to an IF curve as shown in Figure 7. There is
only one clutter notch; however, due to the mismatch of the
constant filter to the stagger pattern we obtain heavy losses in
the passband of the filter. An FIR filter with sliding computation
of the filter coefficients yields an IF curve as shown in
Figure 8. We notice that, except for a loss of a few dB, good
filter characteristics are obtained.

4.  CONCLUSIONS 

Space-time least squares FIR filters are a highly efficient way
of clutter rejection in air- or spaceborne radar. Radar operation
with staggered PRI may be an attractive feature of airborne
pulse-Doppler radars, with the potential of unambiguous
Doppler estimation and avoidance of blind velocities. The
optimum STAP processor as suggested by the likelihood ratio
test can cope well with non-stationarities of the received echo
sequence caused by PRI staggering. FIR filters with constant
coefficients are by nature based on stationary echo sequences.
Such filters, however, can be applied to staggered echo
sequences if the filter is re-adapted with every PRI. It is shown
by numerical examples that the time-varying space-time FIR
filter can operate well on staggered echo data. The penalty for
staggering is a loss in signal-to-clutter ratio of a few dB.


3.  NUMERICAL  EXAMPLES 

The principle of clutter FIR filtering with time-varying coefficients
is illustrated in Figures 3-8. As an example, a sidelooking
radar with a linear array antenna was assumed. The look direction
is perpendicular to the flight path, i.e., broadside.

Figure 3 shows the improvement factor in signal-to-clutter-plus-noise
ratio versus the normalized target Doppler frequency.
The PRF is constant and has been chosen so that
ambiguous clutter notches show up in the clutter band (PRF =
4 x Nyquist of the clutter band). The primary clutter notch is
at F = 0. The other notches are repetitions due to Doppler
ambiguity.

The same kind of IF plot has been calculated for the
space-time FIR filter as given by Figure 4. The curve is quite

ABSTRACT

Multichannel parametric filters are currently being studied as
a means of reducing the dimension of STAP algorithms for
interference rejection in airborne pulsed-Doppler radar systems.
These filters are attractive due to the low computational
cost associated with their implementation as well as their
near-optimal performance with a small amount of training data in
a stationary environment. However, these filters do not perform
well in certain types of non-stationary environments. This paper
presents two modifications to the Space-Time AutoRegressive
(STAR) filter that we previously proposed. The first modification
is based on the Extended Sample Matrix Inversion (ESMI) technique
and is used in the presence of range-varying clutter, which
arises from the use of non-linear antenna arrays or bistatic radar
systems. The second modification to the STAR filter is for use in
the presence of hot clutter and is a three-dimensional STAP
algorithm. Using a realistic simulated data set for circular array
STAP, we show that the modifications to the STAR filter improve
the performance in the presence of non-stationary interference.

1.  INTRODUCTION 

The use of space-time adaptive processing (STAP) for airborne
radar interference mitigation is usually limited by the lack of
stationary secondary data used for training the filter. This problem
is made worse when the radar platform is operating under
circumstances that lead to additional non-stationary components in the
interference. Such circumstances include the use of a non-linear
or non-side-looking array, which leads to a range variation of the
clutter statistics, or the presence of an airborne jamming source,
which leads to hot clutter or terrain scattered interference.

Partially adaptive STAP filters alleviate this problem to a
degree by taking advantage of the low-rank nature of the clutter.
The partially adaptive STAP filters use fewer degrees of freedom
and therefore need fewer training samples than the fully adaptive
STAP filter. One such partially adaptive STAP filter that is
discussed in this paper is the Space-Time AutoRegressive (STAR)
filter [1]. The partially adaptive STAP filters offer an improvement
over the fully adaptive STAP filter but are still derived based
upon the assumption that the interference is stationary. When the
non-stationary component of the interference follows a specified
model, this model may be taken into account to derive a filter to
cancel the non-stationary interference. A few partially adaptive
STAP algorithms have been derived to account for range-varying
interference [2] and hot clutter [3].

Parametric filters (such as the STAR filter) have been shown to
achieve near-optimal performance with a small amount of training
data when the interference is stationary [4]. However, the
performance when the interference is non-stationary leaves much room
for improvement. In this paper, two extensions of the STAR filter
to account for both range-varying interference and hot clutter
are presented. The improvements that the range-varying Extended
STAR (ESTAR) filter offers over the standard STAR filter are
illustrated with a synthetic data set generated by MIT Lincoln
Laboratory that simulates the output of a 20 element antenna array whose
elements lie along a circular arc of 120° [2]. This ESTAR filter
is also shown to have better performance than a range-varying
extended post-Doppler algorithm.

The three-dimensional STAR filter used to mitigate hot clutter
is tested using the same data set as above augmented with
synthetic hot clutter. The 3D-STAR filter achieves a significant
improvement in signal-to-interference-plus-noise ratio (SINR) over
the standard STAR approach. In comparing the 3D-STAR filter to
a three-dimensional optimized pre-Doppler algorithm, it is shown
that the performance of the two filters is nearly the same but that
the 3D-STAR filter has a narrower clutter notch. This narrow
clutter notch allows for improved detection of slowly moving targets.

In the next section, we briefly present the standard data model
used for STAP problems and introduce the notation used
throughout the paper. The STAR filtering technique is described in Section
3 as a background for the extensions presented herein. Section 4
presents the range-varying extended STAR filter that is used when
the clutter statistics are range-varying. Section 5 derives a
3D-STAR filter used for the mitigation of hot clutter and Section 6
shows the results of several numerical simulations of the filters.

2.  DATA  MODEL 

A target present in a particular range bin during some coherent
processing interval (CPI) may be modeled as producing the following
baseband vector signal (after pulse compression and demodulation)
[5]:

$$\mathbf{x}_\ell(t) = b\,\mathbf{a}(\theta)\,e^{j\omega t} + \mathbf{n}_\ell(t) \in \mathbb{C}^{m}, \quad t = 1, \ldots, N, \tag{1}$$

where $\ell$ is the range bin in which the target is located, $b$ is the
complex amplitude of the signal, $\omega$ is the Doppler shift due to the
relative motion between the array platform and the target, $\mathbf{a}(\theta)$ is
the response of the array to a unit amplitude plane wave arriving
from direction $\theta$ (azimuth and elevation angles), and $\mathbf{n}_\ell(t)$
contains contributions from clutter, jamming, and thermal noise.
In (1), we are assuming an array of $m$ elements and a total of $N$
transmitted pulses covering $R$ range bins.

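If we stack the $N$ array outputs into a single $mN \times 1$ space-time
snapshot, we may re-write (1) as

$$\boldsymbol{\chi}_\ell = \begin{bmatrix} \mathbf{x}_\ell(1) \\ \vdots \\ \mathbf{x}_\ell(N) \end{bmatrix} = b\,\mathbf{s}(\theta,\omega) + \boldsymbol{\eta}_\ell \tag{2}$$

where

$$\mathbf{s}(\theta,\omega) = \mathbf{v}(\omega) \otimes \mathbf{a}(\theta)$$
$$\mathbf{v}(\omega) = \begin{bmatrix} 1 & e^{j\omega} & \cdots & e^{j(N-1)\omega} \end{bmatrix}^T$$
$$\boldsymbol{\eta}_\ell = \begin{bmatrix} \mathbf{n}_\ell(1)^T & \cdots & \mathbf{n}_\ell(N)^T \end{bmatrix}^T$$

and $\otimes$ represents the Kronecker product. The vector $\boldsymbol{\eta}_\ell$ contains
the stacked vector samples of the clutter and interference for range
bin $\ell$, and has an unknown covariance matrix denoted by

$$E\{\boldsymbol{\eta}_\ell \boldsymbol{\eta}_\ell^{*}\} = \mathbf{R}.$$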

The clutter is neither temporally nor spatially white; in fact the
rank of $\mathbf{R}$ is typically much less than $mN$. The rank ($\rho$) of $\mathbf{R}$ is
important because it determines how many secondary data samples
are required to accurately estimate $\mathbf{R}$. According to [6], the
number of required samples is on the order of $2\rho$ to $5\rho$. The fully
adaptive approach to whitening this type of data is to multiply the
data by the inverse square-root of an estimate of the matrix $\mathbf{R}$.
Because the size of this matrix can become quite large, its low-rank
nature is exploited to derive reduced-dimension whitening
algorithms. The next section summarizes the work in [1] as a
background for extending the STAR filter.
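The stacking in (2) can be sketched in a few lines of NumPy. This is only an illustration of the Kronecker structure of the space-time steering vector: the half-wavelength uniform linear array used for the spatial response and the function names are assumptions made for the example, not part of the paper (which uses a circular array).

```python
import numpy as np

def temporal_steering(omega, N):
    """v(omega) = [1, e^{j omega}, ..., e^{j (N-1) omega}]^T."""
    return np.exp(1j * omega * np.arange(N))

def space_time_steering(a, omega, N):
    """s(theta, omega) = v(omega) kron a(theta), a vector of length m*N."""
    return np.kron(temporal_steering(omega, N), a)

# Hypothetical spatial response: half-wavelength uniform linear array.
m, N = 4, 6
theta = np.deg2rad(20.0)
a = np.exp(1j * np.pi * np.sin(theta) * np.arange(m))

s = space_time_steering(a, omega=0.3, N=N)
# The first m entries are the array response for pulse 1; each later
# block of m entries is the same response advanced by the Doppler phase.
```

With this ordering, the entries belonging to pulse t occupy positions (t-1)m through tm-1, which matches the stacking of the array outputs in (2).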

3.  SPACE-TIME  AUTOREGRESSIVE  FILTERING 

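Following the derivation in [1], the STAR approach assumes that a
set of $L$ matrices $\mathbf{H}_0, \mathbf{H}_1, \ldots, \mathbf{H}_{L-1}$ of dimension $m' \times m$ exists
that satisfies

$$\sum_{i=0}^{L-1} \mathbf{H}_i\,\mathbf{n}(t+i) = \mathbf{0}, \quad t = 1, \ldots, N-L+1, \tag{3}$$

for the interference and clutter in the primary range bin. We may
also write (3) in the following two different ways:

$$\underbrace{\begin{bmatrix} \mathbf{H}_0 & \cdots & \mathbf{H}_{L-1} \end{bmatrix}}_{\mathbf{H}^{*}} \underbrace{\begin{bmatrix} \mathbf{n}(1) & \cdots & \mathbf{n}(N-L+1) \\ \vdots & & \vdots \\ \mathbf{n}(L) & \cdots & \mathbf{n}(N) \end{bmatrix}}_{\mathcal{N}} = \mathbf{0} \tag{4}$$

or

$$\mathcal{H}^{*} \boldsymbol{\eta} = \mathbf{0}, \tag{5}$$

where

$$\mathcal{H}^{*} = \begin{bmatrix} \mathbf{H}_0 & \cdots & \mathbf{H}_{L-1} & & & \\ & \mathbf{H}_0 & \cdots & \mathbf{H}_{L-1} & & \\ & & \ddots & & \ddots & \\ & & & \mathbf{H}_0 & \cdots & \mathbf{H}_{L-1} \end{bmatrix}. \tag{6}$$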


In cases where the clutter is stationary, we assume that equations (4)
and (5) also hold for the secondary data:

$$\mathbf{H}^{*} \mathcal{N}_k = \mathbf{0} \tag{7}$$

$$\mathcal{H}^{*} \boldsymbol{\eta}_k = \mathbf{0}, \tag{8}$$

for $k = 1, \ldots, N_s$, where $N_s$ is the number of secondary data
snapshots used to train the filter.

The matrix $\mathcal{H}$ is $mN \times m'(N-L+1)$. If (3) holds and $m'$ and
$L$ are chosen so that $m'(N-L+1)$ is large enough, the columns
of $\mathcal{H}$ form a basis for the space orthogonal to the clutter and
interference subspace. Although this relationship does not hold in
practice due to the presence of thermal noise, a least squares
solution is applied to approximate the subspace. This suggests that the
following space-time filter (similar to the matched subspace detectors
in [7]) be used for interference rejection:

$$\mathbf{w}_{ar}(\theta,\omega) = \mathbf{P}_{\mathcal{H}}\,\mathbf{s}(\theta,\omega), \tag{9}$$

where $\mathbf{P}_{\mathcal{H}}$ is the projection onto the columns of $\mathcal{H}$:

$$\mathbf{P}_{\mathcal{H}} = \mathcal{H}\,(\mathcal{H}^{*}\mathcal{H})^{-1}\,\mathcal{H}^{*}. \tag{10}$$

We refer to the implementation of STAP with the weight vector
of (9) as Space-Time AutoRegressive (STAR) filtering. The STAR
filter weights are "adaptive" in the sense that $\mathbf{H}$ must be estimated
from the secondary data prior to computation of $\mathbf{w}_{ar}$.
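The chain from (3) to (10) can be illustrated with a small NumPy sketch. It is a toy under stated assumptions rather than the authors' implementation: the clutter is noise-free and built from a few point scatterers so that (3) holds exactly, the least-squares estimate of H is taken from the left singular vectors of the stacked secondary data with the smallest singular values, and a pseudo-inverse replaces the explicit inverse in (10) for numerical robustness. All function names and parameter values are hypothetical.

```python
import numpy as np

def hankel_data(snap, L):
    """snap is m x N (one range bin). Returns the (m*L) x (N-L+1) matrix
    of eq. (4), whose column t is [n(t); n(t+1); ...; n(t+L-1)]."""
    m, N = snap.shape
    return np.vstack([snap[:, i:N - L + 1 + i] for i in range(L)])

def estimate_H(secondary, L, m_prime):
    """Least-squares H (m*L x m'): left singular vectors of the stacked
    secondary data belonging to the m' smallest singular values."""
    data = np.hstack([hankel_data(x, L) for x in secondary])
    U, _, _ = np.linalg.svd(data, full_matrices=True)
    return U[:, -m_prime:]

def block_toeplitz(H, m, N, L):
    """Script-H of eq. (6), size (m*N) x (m'*(N-L+1)): the same H,
    shifted down by one pulse (m rows) for each block column."""
    m_prime = H.shape[1]
    n = N - L + 1
    Hc = np.zeros((m * N, m_prime * n), dtype=complex)
    for t in range(n):
        Hc[t * m:(t + L) * m, t * m_prime:(t + 1) * m_prime] = H
    return Hc

def star_weights(Hc, s):
    """w = P s, with P the orthogonal projector onto the columns of Hc."""
    return Hc @ (np.linalg.pinv(Hc) @ s)

# Toy stationary clutter: r point scatterers with fixed spatial responses
# and Doppler frequencies, and fresh random amplitudes in every range bin.
rng = np.random.default_rng(0)
m, N, L, m_prime, r = 4, 8, 2, 3, 3
A = rng.standard_normal((m, r)) + 1j * rng.standard_normal((m, r))
D = np.exp(1j * np.outer(rng.uniform(-np.pi, np.pi, r), np.arange(N)))

def clutter_snapshot():
    c = rng.standard_normal(r) + 1j * rng.standard_normal(r)
    return (A * c) @ D                       # m x N

secondary = [clutter_snapshot() for _ in range(10)]
H = estimate_H(secondary, L, m_prime)
Hc = block_toeplitz(H, m, N, L)

eta = clutter_snapshot().T.reshape(-1)       # primary-bin clutter, stacked as in (2)
residual = np.linalg.norm(Hc.conj().T @ eta) # eq. (5): should be numerically zero
s = np.exp(1j * 0.7 * np.arange(m * N))      # arbitrary stand-in steering vector
w = star_weights(Hc, s)
clutter_output = abs(np.vdot(w, eta))        # w^* eta: clutter is nulled
```

In this noise-free toy the clutter is cancelled exactly. With thermal noise present the small singular values are no longer zero, which is why the text describes the estimate as a least-squares approximation of the subspace.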

4.  RANGE-VARYING  EXTENDED  STAR  FILTER 

The STAR filter of the previous section is not designed to handle
non-stationary interference of any kind. This section derives a
STAR-based filter that assumes the clutter statistics vary linearly
with range. This assumption is reasonable if the training region is
kept short. The idea of using time-varying weights in a STAP
algorithm was introduced in [8] as an extended sample matrix inversion
algorithm and this idea was used for range-varying STAP weights
in [2]. This technique increases the dimension of the problem by
a factor of two but does improve the performance when there is a
rapidly changing clutter locus.

The idea behind range-varying weights is that the weight vector
is a function of range ($r$) to account for the non-stationary clutter
locus. Expanding the weight vector into a power series yields

$$\mathbf{w}(r) = \mathbf{w}_0 + r\,\dot{\mathbf{w}}_0 + \frac{r^2}{2}\,\ddot{\mathbf{w}}_0 + \cdots. \tag{11}$$

The assumption is made that the clutter locus is changing slowly
enough that for a given collection of ranges the weight vector is
linear in $r$. Ignoring the higher order terms in the Taylor series,
the weight vector as a function of the $k$th range bin becomes

$$\mathbf{w}_k = \mathbf{w}_0 + \alpha k\,\Delta\mathbf{w}_0, \tag{12}$$

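where $\alpha$ is a normalization constant. Defining

$$\tilde{\mathbf{w}} = \begin{bmatrix} \mathbf{w}_0 \\ \Delta\mathbf{w}_0 \end{bmatrix}, \tag{13}$$

the output of the filter may be written as

$$z_k = \tilde{\mathbf{w}}^{*} \begin{bmatrix} \boldsymbol{\chi}_k \\ \alpha k\,\boldsymbol{\chi}_k \end{bmatrix} = \tilde{\mathbf{w}}^{*}\tilde{\boldsymbol{\chi}}_k, \tag{14}$$

where $\tilde{\boldsymbol{\chi}}_k$ is the extended data vector.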
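The extended form follows directly from substituting the linear weight model (12) into the filter output. As a short worked step, writing $\tilde{\mathbf{w}}$ for the stacked weights $[\mathbf{w}_0^T \; \Delta\mathbf{w}_0^T]^T$ and $\tilde{\boldsymbol{\chi}}_k$ for the stacked data (labels assumed here, and $\alpha$ taken to be real):

$$z_k = \mathbf{w}_k^{*}\boldsymbol{\chi}_k = (\mathbf{w}_0 + \alpha k\,\Delta\mathbf{w}_0)^{*}\boldsymbol{\chi}_k = \mathbf{w}_0^{*}\boldsymbol{\chi}_k + \alpha k\,\Delta\mathbf{w}_0^{*}\boldsymbol{\chi}_k = \begin{bmatrix} \mathbf{w}_0 \\ \Delta\mathbf{w}_0 \end{bmatrix}^{*} \begin{bmatrix} \boldsymbol{\chi}_k \\ \alpha k\,\boldsymbol{\chi}_k \end{bmatrix}.$$

The range dependence is thus moved from the weights into the data: a single fixed weight vector of twice the length is applied to a data vector whose lower half is scaled by the range index. This is the factor-of-two increase in dimension mentioned above.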

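Using this same idea for the STAR filter (i.e., assuming that
the STAR filter coefficients that null the clutter vary linearly with
range) we can rewrite (3) using an extended data vector

$$\sum_{i=0}^{L-1} \begin{bmatrix} \mathbf{H}_i & \Delta\mathbf{H}_i \end{bmatrix} \begin{bmatrix} \mathbf{n}_k(t+i) \\ \alpha k\,\mathbf{n}_k(t+i) \end{bmatrix} = \mathbf{0}. \tag{15}$$

Letting $\Delta\mathbf{H}$ and $\Delta\mathcal{H}$ be defined similarly to $\mathbf{H}$ in (4) and $\mathcal{H}$ in (6),
we may rewrite (7) and (8) as

$$\begin{bmatrix} \mathbf{H}^{*} & \Delta\mathbf{H}^{*} \end{bmatrix} \begin{bmatrix} \mathcal{N}_k \\ \alpha k\,\mathcal{N}_k \end{bmatrix} = \mathbf{0} \tag{16}$$

$$\begin{bmatrix} \mathcal{H}^{*} & \Delta\mathcal{H}^{*} \end{bmatrix} \begin{bmatrix} \boldsymbol{\eta}_k \\ \alpha k\,\boldsymbol{\eta}_k \end{bmatrix} = \mathbf{0}. \tag{17}$$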

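The filter parameters $\mathbf{H}$ and $\Delta\mathbf{H}$ may then be estimated using the
left null space of the matrix

$$\begin{bmatrix} \mathcal{N}_{-Q} & \cdots & \mathcal{N}_{Q} \\ -\alpha Q\,\mathcal{N}_{-Q} & \cdots & \alpha Q\,\mathcal{N}_{Q} \end{bmatrix}, \tag{18}$$

where $Q = N_s/2$. Following what was done in [2], the constant $\alpha$
is chosen as

$$\alpha = \sqrt{\frac{12}{(N_s+2)(N_s+1)}} \tag{19}$$

to yield a "flat" noise subspace.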
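One way to read the "flat" noise subspace condition is that white noise should contribute equal power to the unscaled and the range-scaled halves of the extended data matrix. The short check below takes that reading as an assumption, together with two further assumptions: the numerator 12 in the normalization constant, and training bins k = -Q, ..., -1, 1, ..., Q that exclude the primary bin k = 0. The function name is hypothetical.

```python
import numpy as np

def esmi_alpha(Ns):
    """Normalization constant, assuming alpha = sqrt(12 / ((Ns+2)(Ns+1)))."""
    return np.sqrt(12.0 / ((Ns + 2) * (Ns + 1)))

Ns = 50
Q = Ns // 2
# Training bins on either side of the primary bin; k = 0 is excluded.
k = np.concatenate([np.arange(-Q, 0), np.arange(1, Q + 1)])

alpha = esmi_alpha(Ns)
top_power = float(k.size)                      # sum over bins of 1^2
bottom_power = float(np.sum((alpha * k) ** 2)) # sum over bins of (alpha k)^2
```

Under these assumptions the two sums agree, because the sum of k squared over the training bins equals Ns(Ns+1)(Ns+2)/12.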

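To define the weight vector, let

$$\tilde{\mathcal{H}} = \begin{bmatrix} \mathcal{H} \\ \Delta\mathcal{H} \end{bmatrix} \tag{20}$$

so that

$$\mathbf{w}_{ear}(\theta,\omega) = \mathbf{P}_{\tilde{\mathcal{H}}} \begin{bmatrix} \mathbf{s}(\theta,\omega) \\ \mathbf{0} \end{bmatrix}. \tag{21}$$

Filtering the extended data vector with (21) is referred to as the
Extended STAR (ESTAR) filter. When estimating a range-varying
weight vector using data that also varies with range, a higher number
of training vectors may be used before performance starts to
degrade.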


5.  STAR  FILTERING  FOR  HOT  CLUTTER 

When the radar platform is operating in an environment where
there is an airborne jamming source present, two main considerations
must be made. First, the hot clutter covariance changes from
pulse to pulse and second, the hot clutter has non-zero correlations
across range bins [3]. This section derives a STAR-based filter
that is effective in canceling hot clutter. The baseline STAR filter
is first modified to handle any type of interference that changes
from pulse to pulse (as with intrinsic clutter motion) and then an
additional dimension is added to the vector autoregressive filter to
account for the correlations across range bins.

The model for the clutter in (3) is no longer valid since the
spatial covariance changes from pulse to pulse. If the standard STAR
model is used in a non-stationary environment like hot clutter, it
tries to account for the time variations in the data by increasing
the number of filter taps required to achieve clutter cancelation.
A better model for this is to let the coefficients of the space-time
prediction error filter change with time:

$$\mathcal{H}_{TV}^{*} = \begin{bmatrix} \mathbf{H}_0(1) & \cdots & \mathbf{H}_{L-1}(1) & & \\ & \ddots & & \ddots & \\ & & \mathbf{H}_0(n) & \cdots & \mathbf{H}_{L-1}(n) \end{bmatrix} \tag{22}$$

where $n = N - L + 1$ and where each block row is a set of
new coefficients based on dropping the data from the oldest pulse
and adding the data from the most recent pulse. For this
time-varying STAR filter, a greater number of filter parameters must be
estimated ($n$ times the degrees of freedom required for the standard
STAR algorithm) and therefore, more sample support is required
to train the filter.


To complete the derivation of the 3D-STAR filter, a few
definitions need to be made. To clarify the notation, sampling across
pulses is called slow-time sampling and sampling across range
bins is called fast-time sampling. Let $P$ be the number of fast-time
samples over which the hot clutter is correlated.

In order to utilize the fast-time correlation of the data, an extra
dimension is added to the STAR filter. We assume for a moment
that the interference is stationary across the pulses (slow-time).
This filter will model the fast-time and slow-time correlations with
a two-dimensional VAR filter. For a set of $LJ$ matrices of size
$M' \times M$, assume that the clutter obeys the model

$$\sum_{j=0}^{J-1} \sum_{i=0}^{L-1} \mathbf{H}_{i,j}\,\mathbf{n}_{k+j}(t+i) = \mathbf{0}, \quad t = 1, \ldots, N-L+1, \quad k = 1, \ldots, P-J+1, \tag{23}$$

where $k = 0$ is the range bin of interest and $\mathbf{n}_k(t)$ is the spatial
snapshot for the $t$th pulse and the $k$th range bin. This may also be
expressed as

$$\sum_{j=0}^{J-1} \mathcal{H}_j^{*}\,\boldsymbol{\eta}_{k+j} = \mathbf{0}, \quad k = 1, \ldots, P-J+1, \tag{24}$$

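where $\mathcal{H}_j$ is the matrix defined in (6) with a subscript $j$ to indicate
which fast-time sample it is associated with. From this point we
again take into account the slow-time variations caused by the hot
clutter by replacing $\mathcal{H}_j$ with the slow-time varying filter $\mathcal{H}_{TV,j}$.
Rewriting this sum with the time-varying filter we get

$$\mathbb{H}^{*}\,\boldsymbol{\eta}_{3D}(k) = \mathbf{0}, \tag{25}$$

where

$$\boldsymbol{\eta}_{3D}(k) = \begin{bmatrix} \boldsymbol{\eta}_k \\ \vdots \\ \boldsymbol{\eta}_{k+P-1} \end{bmatrix}, \qquad \mathbb{H}^{*} = \begin{bmatrix} \mathcal{H}_{TV,0}^{*} & \cdots & \mathcal{H}_{TV,J-1}^{*} & & \\ & \ddots & & \ddots & \\ & & \mathcal{H}_{TV,0}^{*} & \cdots & \mathcal{H}_{TV,J-1}^{*} \end{bmatrix}.$$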

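Assuming that there is target energy in the $k = 0$ range bin, then
there will also be target energy in the vectors $\boldsymbol{\eta}_{3D}(0), \boldsymbol{\eta}_{3D}(-1),
\ldots, \boldsymbol{\eta}_{3D}(-P+1)$, which may not be used for training the filter. In
order to define the algorithm to find the filter coefficients let

$$\mathbf{H}(t)^{*} = \begin{bmatrix} \mathbf{H}_{0,0}(t) & \cdots & \mathbf{H}_{L-1,J-1}(t) \end{bmatrix} \tag{26}$$

$$\mathbf{g}_k(t) = \begin{bmatrix} \mathbf{n}_k(t) \\ \vdots \\ \mathbf{n}_k(t+L-1) \end{bmatrix} \tag{27}$$

$$\mathbf{G}_k(t) = \begin{bmatrix} \mathbf{g}_k(t) & \cdots & \mathbf{g}_{k+P-J}(t) \\ \vdots & & \vdots \\ \mathbf{g}_{k+J-1}(t) & \cdots & \mathbf{g}_{k+P-1}(t) \end{bmatrix} \tag{28}$$

$$\mathbf{G}(t) = \begin{bmatrix} \mathbf{G}_1(t) & \cdots & \mathbf{G}_{N_s}(t) \end{bmatrix}. \tag{29}$$

The filter coefficients can then be found by the following least
squares criterion:

$$\hat{\mathbf{H}}(t) = \arg\min_{\mathbf{H}(t)} \left\| \mathbf{H}(t)^{*}\,\mathbf{G}(t) \right\|_F^2, \quad t = 1, \ldots, N-L+1, \tag{30}$$

subject to the constraint that $\mathbf{H}(t)^{*}\mathbf{H}(t) = \mathbf{I}$. From this point
the $M'$ left singular vectors corresponding to the smallest singular
values of each $\mathbf{G}(t)$ matrix will be used to compute the $N-L+1$
sets of filter coefficients which define $\mathbb{H}$. With a defined subspace,
a weight vector for mitigation of hot clutter is

$$\mathbf{w}_{3D}(\theta,\omega) = \mathbf{P}_{\mathbb{H}}\,\mathbf{s}_{3D}(\theta,\omega), \tag{31}$$

where

$$\mathbf{s}_{3D}(\theta,\omega) = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \otimes \mathbf{s}(\theta,\omega). \tag{32}$$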
This 3D-STAR filter will require more training data than the STAR
filter (on the order of $N-L+1$ times more) due to the non-stationary
prediction error filter that is used in the implementation.
This  additional  sample  support  requirement  is  less  of  an  issue  than 
with  other  3D  implementations  because  the  STAR  approach  typ¬ 
ically  requires  much  less  secondary  data  for  good  performance. 
The  3D-STAR  filter  also  assumes  that  the  data  is  stationary  for  P 
fast-time  samples. 
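The constrained least-squares step, in which the left singular vectors with the smallest singular values are taken, can be checked numerically. The sketch below is a generic illustration on random data rather than the paper's hot clutter data; it assumes the norm in the criterion is the Frobenius norm, and the function name is hypothetical.

```python
import numpy as np

def constrained_ls_basis(G, m_prime):
    """Minimise ||H^* G||_F^2 subject to H^* H = I by taking the m' left
    singular vectors of G with the smallest singular values."""
    U, _, _ = np.linalg.svd(G, full_matrices=True)
    return U[:, -m_prime:]

rng = np.random.default_rng(2)
G = rng.standard_normal((6, 40)) + 1j * rng.standard_normal((6, 40))

H = constrained_ls_basis(G, m_prime=2)
residual = np.linalg.norm(H.conj().T @ G, "fro") ** 2

# The attained minimum equals the sum of the squared smallest singular values.
sv = np.linalg.svd(G, compute_uv=False)
expected = np.sum(sv[-2:] ** 2)

# Any other orthonormal pair of columns gives a residual at least as large.
Qr, _ = np.linalg.qr(rng.standard_normal((6, 2)) + 1j * rng.standard_normal((6, 2)))
competitor = np.linalg.norm(Qr.conj().T @ G, "fro") ** 2
```

In the 3D-STAR setting this decomposition would be repeated for each of the N-L+1 slow-time indices, which is where the additional training-data requirement noted above comes from.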

6.  NUMERICAL  RESULTS 

The algorithms presented herein are tested using a data set created
by MIT Lincoln Laboratory that simulates the output of a 20 element
array. These elements lie along a circular arc of 120° with
radius 2.96 m and are assumed to have a cosine-shaped response
with a -30 dB backlobe for both azimuth and elevation dimensions.
The airborne platform is moving with a velocity of 100 m/s above
a 4/3 earth model at an altitude of 9000 m. The operating frequency
of the radar is taken to be 435 MHz, the radar bandwidth and
sampling frequency are 3.75 MHz, the pulse-repetition frequency is
300 Hz, and $N = 18$ pulses are assumed to be transmitted during
one CPI. Data are generated for 9325 range gates between 20
and 400 km with a clutter-to-white-noise power ratio of 40 dB at a
range of 100 km.

Hot  clutter  is  included  in  the  data  by  adding  a  term  of  the  form 


where $b_j$ is the amplitude of the jammer,

$$\mathbf{c}_k(t) = \mathbf{a}(\theta_j)\,z_k + \sum_{i=1}^{\ell} \mathbf{b}_i(t)\,z_{k-i}$$

is the contribution of the hot clutter for a single pulse at range $k$,
$\ell$ is the longest multipath delay, $\theta_j$ is the direction of arrival of the
jammer signal, $z_k$ is the jammer waveform (white in both slow and
fast-time), and $\mathbf{b}_i$ is a random vector that approximates the sum of
the spatial steering vectors for each of the multipath signals. When
present, the jammer-to-clutter power ratio is assumed to be 10 dB.
When secondary data are used to estimate the clutter covariance or
STAR filter parameters, equal amounts of data from range bins on
either side of the target range bin are used.

The true clutter covariance matrix used to generate the data
is known for 20 of the 9325 range bins, and thus the maximum
achievable SINR can be calculated at these ranges. To illustrate
the performance of the algorithms we use either the SINR loss
as a function of Doppler for an azimuth of 0° or the "average"
SINR loss as compared with the optimal (known covariance)
solution. This average SINR loss is defined as the area between the
algorithm's SINR curve and that achievable assuming $\mathbf{R}$ is known.
This is depicted in Figure 1. The ESTAR filter will be compared
to the range-varying extended post-Doppler PRI staggered STAP
algorithm [2] and the 3D-STAR algorithms will be compared with
the optimized 3D pre-Doppler STAP algorithm [3]. For the
STAR-based filters, $M' = 20$ is used for all the examples and for the
partially adaptive STAP algorithm, three pulses at a time are
processed and a diagonal loading of about five times the noise level is
used for sample matrix inversion.
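The SINR loss metric can be sketched as follows. The output SINR of a weight vector w for a unit-amplitude target with steering vector s is taken as |w* s|^2 / (w* R w), and the known-covariance optimum is attained by w proportional to R^{-1} s; these are standard expressions. How the paper normalises the "area between the curves" is not stated, so dividing by the Doppler span below is an assumption, as are the function names.

```python
import numpy as np

def sinr(w, s, R):
    """Output SINR of weight vector w for a unit-amplitude target s."""
    return np.abs(np.vdot(w, s)) ** 2 / np.real(np.vdot(w, R @ w))

def sinr_loss_db(w, s, R):
    """SINR of w relative to the known-covariance optimum, in dB (<= 0)."""
    w_opt = np.linalg.solve(R, s)
    return 10.0 * np.log10(sinr(w, s, R) / sinr(w_opt, s, R))

def average_sinr_loss(loss_db, doppler):
    """Area between the 0 dB optimum and the loss curve (trapezoid rule),
    divided by the Doppler span; the normalisation is an assumption."""
    gap = -np.asarray(loss_db, dtype=float)
    area = np.sum(0.5 * (gap[1:] + gap[:-1]) * np.diff(doppler))
    return area / (doppler[-1] - doppler[0])

# Small synthetic check with a random Hermitian positive definite covariance.
rng = np.random.default_rng(1)
n = 6
X = rng.standard_normal((n, 3 * n)) + 1j * rng.standard_normal((n, 3 * n))
R = X @ X.conj().T / (3 * n) + np.eye(n)
s = np.exp(1j * 0.4 * np.arange(n))

loss_opt = sinr_loss_db(np.linalg.solve(R, s), s, R)   # optimum weights
loss_conv = sinr_loss_db(s, s, R)                      # non-adaptive weights
avg = average_sinr_loss(np.full(11, -3.0), np.linspace(-0.5, 0.5, 11))
```

By the Cauchy-Schwarz inequality no weight vector can exceed the known-covariance optimum, so the loss in dB is never positive.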

A performance evaluation of the ESTAR filter at a range of
20 km is shown in Figures 2 and 3. Figure 2 compares the
performance of the ESTAR filter and the basic STAR filter as a function
of $L$ for $N_s = 50$ (2 km training window). This figure shows
that the ESTAR filter does perform better than the STAR filter at
close ranges. We also see that the ESTAR filter requires fewer
filter taps than the STAR filter, thus offsetting some of the additional
computational cost associated with the extended implementation.
Figure 3 compares the performance of the STAR filters with the
range-varying extended PRI staggered and fully adaptive STAP
algorithms as a function of training length. Note that the
performance of the STAR algorithm degrades quickly as more training
data is used. The extended PRI STAP and ESTAR filters both have
nearly flat performance as $N_s$ is increased due to the range-varying
weights. The ESTAR filter also has much better performance than
the extended PRI STAP algorithm because it requires much less
training data to converge to its best performance.

Another aspect of performance is the computational load
required to implement the algorithms. For the STAR algorithms the
implementation is broken up into two steps. The first step involves
taking the SVD of the $2ML \times (N-L+1)N_s$ data matrix $\mathcal{N}$
and the second is forming the projection operator. The bulk of the
computation involved in this second step is finding the inverse of
$\mathcal{H}^{*}\mathcal{H}$, which is usually a sparse banded matrix. Taking this into
account, the computational load for the ESTAR algorithm is

$$O(4(ML)^2(N-L+1)N_s) + O((ML)^2(N-L+1)M').$$

For the parameters of the circular array data with $L = 4$ and $M' = 20$,
the computational cost is

$$\text{ESTAR} = O(3.84 \times 10^{5} N_s) + O(1.92 \times 10^{6}).$$

Comparing this with the cost of the STAR filter (at $L = 5$):

$$\text{STAR} = O((ML)^2(N-L+1)N_s) + O((ML)^2(N-L+1)M') = O(1.4 \times 10^{5} N_s) + O(2.8 \times 10^{6}),$$

the ESTAR algorithm has only a small increase in computational
load. The extended PRI STAP algorithm has a computational cost
of

$$\text{EPRISTAP} = O(4(MK)^2(N-K+1)N_s) + O(4(MK)^2(N-K+1)\rho) = O(2.3 \times 10^{5} N_s) + O(2.0 \times 10^{7}),$$

where $K = 3$ pulses are processed at a time and $\rho = 90$ is the
approximate rank of each sub-CPI. From this we see that if $N_s$ is
not too big ($N_s < 100$), then the ESTAR algorithm requires far
fewer computations than the PRI-staggered STAP algorithm.

Figures 4-6 illustrate the performance of the 3D-STAR filter
when there is hot clutter present and when the direct path jamming
signal is in the mainbeam of the radar system. Figure 4 compares
the performance of the 3D-STAR filter to the basic STAR filter
as a function of $L$. The 3D-STAR filter outperforms the STAR
filter with a small number of filter taps by utilizing the
slow-time-varying taps as well as the additional fast-time tap. Figure 5
compares the STAR filters to the 3D optimized pre-Doppler and fully
adaptive STAP algorithms as a function of training data. In this
case the pre-Doppler and 3D-STAR algorithms have very similar
performance, with the pre-Doppler algorithm slightly outperforming
the 3D-STAR filter. However, Figure 6, which shows the SINR
at $N_s = 80$ or 3.2 km, illustrates that the 3D-STAR filter has a
narrower clutter notch, which results in a lower minimum detectable
velocity. If the small loss in performance away from the clutter notch
is tolerable, the 3D-STAR filter is more desirable due to its greater
percentage of usable Doppler space.

The computational cost of the STAR ($L = 7$), 3D-STAR ($L = 2$,
$J = 2$), and 3D pre-Doppler ($K = 3$ pulses) algorithms for the
system parameters described above are as follows:

$$\text{STAR} = O(2.35 \times 10^{5} N_s) + O(4.7 \times 10^{6})$$

$$\text{3D-STAR} = O((MLJ)^2(N-L+1)(P-J+1)N_s) + O((MLJ)^2(N-L+1)(P-J+1)M') = O(2.18 \times 10^{5} N_s) + O(4.35 \times 10^{6})$$

$$\text{pre-Dopp} = O((MKP)^2(N-K+1)N_s) + O((MKP)^2(N-K+1)\rho) = O(5.18 \times 10^{5} N_s) + O(7.0 \times 10^{7})$$

where $\rho = 135$ is the approximate rank of the sub-CPI covariance
matrix. Again we see that the STAR and 3D-STAR algorithms
have nearly the same computational cost when the filter orders
are chosen close to the best value. It is also seen that the
pre-Doppler algorithm requires a large number of computations when
compared with the 3D-STAR algorithm.
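The operation counts quoted in this section can be reproduced from the stated formulas with M = 20 elements, N = 18 pulses and M' = 20. The sketch below does this arithmetic; the helper names are hypothetical, and the value P = 3 fast-time samples is not stated in the text but is the value that reproduces the quoted 3D-STAR and pre-Doppler figures, so it should be read as an inference.

```python
def star_cost(M, L, N, M_prime, extended=False):
    """Returns (coefficient of Ns, constant term) for STAR, or for ESTAR
    when extended=True (the SVD term carries an extra factor of 4)."""
    base = (M * L) ** 2 * (N - L + 1)
    return (4 * base if extended else base), base * M_prime

def star3d_cost(M, L, J, N, P, M_prime):
    """Operation count for the 3D-STAR filter."""
    base = (M * L * J) ** 2 * (N - L + 1) * (P - J + 1)
    return base, base * M_prime

def sub_cpi_cost(M, K, N, rank, P=1, extended=False):
    """Operation count for the sub-CPI (PRI-staggered or pre-Doppler)
    algorithms; extended=True applies the factor of 4 to both terms."""
    base = (M * K * P) ** 2 * (N - K + 1)
    scale = 4 if extended else 1
    return scale * base, scale * base * rank

M, N, M_prime = 20, 18, 20
costs = {
    "ESTAR (L=4)": star_cost(M, 4, N, M_prime, extended=True),
    "STAR (L=5)": star_cost(M, 5, N, M_prime),
    "EPRISTAP (K=3)": sub_cpi_cost(M, 3, N, rank=90, extended=True),
    "STAR (L=7)": star_cost(M, 7, N, M_prime),
    "3D-STAR (L=2, J=2)": star3d_cost(M, 2, 2, N, P=3, M_prime=M_prime),
    "pre-Doppler (K=3)": sub_cpi_cost(M, 3, N, rank=135, P=3),
}
for name, (per_ns, const) in costs.items():
    print(f"{name}: {per_ns:.3g} * Ns + {const:.3g}")
```

These counts are order-of-magnitude estimates, so agreement to two or three significant figures with the quoted values is all that should be expected.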

7.  CONCLUSIONS 

This paper has presented modifications to the space-time
autoregressive (STAR) filter for two types of non-stationary interference.
The first modified filter (ESTAR) is used when the clutter statistics
are varying with range, as is the case for non-linear antenna arrays
or bistatic radar systems. The second modification (3D-STAR) is
used in the presence of hot clutter, which arises when an airborne
jamming source is present. These two modifications provide an
increase in performance over the standard STAR filter when used
in non-stationary environments without a major increase in
computational burden. We have shown in numerical experiments and
computational analysis that the ESTAR filter is superior to the
extended PRI-staggered post-Doppler STAP algorithm when there is
a rapidly changing clutter locus. We have also shown that the
3D-STAR filter has a little more usable Doppler space than the 3D
optimized pre-Doppler STAP algorithm and the 3D-STAR algorithm
achieves this performance with much less computation.

Figure 2: Performance of ESTAR and STAR at 20 km as a function
of filter order.


Figure  5:  Convergence  of  STAR,  3D-STAR,  pre-Doppler,  and 
fully  adaptive  algorithms  with  hot  clutter  present. 


Figure 3: Comparison with PRI-Staggered and fully adaptive
STAP at 20 km as a function of training length.

Figure 6: SINR of 3D-STAR and pre-Doppler algorithms with
$N_s = 80$ and hot clutter present.


INTERFERENCE  ESTIMATION  AND  MITIGATION  FOR  STAP  USING  THE 
TWO-DIMENSIONAL  WOLD  DECOMPOSITION  PARAMETRIC  MODEL 


Joseph  M.  Francos 

Elec.  &  Comp.  Eng.  Dept. 
Ben-Gurion  University- 
Beer  Sheva  84105,  Israel 


ABSTRACT 

We develop parametric modeling and estimation
methods for STAP data based on the results of
the 2-D Wold-like decomposition. We show that
the same parametric model that results from the
2-D Wold-like orthogonal decomposition naturally
arises as the physical model in the problem of
space-time processing of airborne radar data. We exploit
this correspondence to derive computationally
efficient parametric fully adaptive and partially
adaptive detection algorithms. Having estimated
the parametric models of the noise and interference
components of the field, the estimated parameters
are substituted into the parametric expression of
the covariance matrix to obtain an estimate of the
interference-plus-noise covariance matrix. Hence
the fully adaptive weight vector is obtained. Moreover,
it is proved that it is sufficient to estimate
only the spectral support parameters of each
interference component in order to obtain a projection
matrix onto the subspace orthogonal to the
interference subspace. The proposed partially adaptive
parametric processing algorithm employs this property.
The proposed parametric interference mitigation
procedures can be applied even when only the
information in a single range gate is available, thus
achieving high performance gain when the data in
the different range gates cannot be assumed stationary.

1.  INTRODUCTION 

We propose a new approach for parametric modeling and
estimation of space-time airborne radar data, based on the
2-D Wold-like decomposition of random fields. The goal of
space-time adaptive processing is to manipulate the available
data to achieve high gain at the target angle and
Doppler and maximal mitigation along both the jamming
and clutter lines. Because the interference covariance matrix
is unknown a priori, it is typically estimated using sample
covariances obtained from averaging over a few range
gates. Next, a weight vector is computed from the inverse
of the sample covariance matrix, [1]-[5]. In [8], an approach
that bypasses the need to estimate the covariance matrix
was presented: the data collected in a single range gate
were employed to obtain a least squares estimate of the signal
power at each hypothesized DOA, through evaluation
of a weight vector constrained to null the unknown interference
and noise. In [9] a simple ad hoc model of the clutter
signal and covariance matrix is proposed. The model
represents the spectral density of the clutter as a sum of
Gaussian-shaped humps along the support of the clutter
ridge. In [10] this model is employed to estimate the clutter
covariance matrix from the data observed in a single
range gate.

In this paper, we propose adopting the 2-D Wold-like decomposition of random fields, [6], as the parametric model of the observed data. Employing this model, we derive computationally efficient algorithms for parametrically estimating both the jamming and clutter fields. The estimation procedure we propose is capable of producing estimates of the parametric models of the interference signals even from the information in a single range gate. Hence, no averaging over a few range gates is required, which yields a high performance gain in the practical case where the data in the different range gates is non-stationary. Having estimated the parametric models of the interference terms, their covariance matrix can be evaluated from the estimated parameters. Moreover, the problem of evaluating the rank of the low-rank covariance matrix of the interference is solved as a by-product of obtaining the parametric estimates of the interference components. Once the parametric models of the interference components have been estimated, several alternative detection procedures are available. In this paper we present two such methods: parametric fully adaptive processing and parametric partially adaptive processing.

2.  THE  RANDOM  FIELD  MODEL 

In this section we briefly describe the 2-D Wold-like decomposition of random fields, [6]. Let $\{y(n,m)\}$, $(n,m) \in \mathbb{Z}^2$, be a complex valued, regular, homogeneous random field. Then, $y(n,m)$ can be uniquely represented by the orthogonal decomposition

$$y(n,m) = w(n,m) + v(n,m). \tag{1}$$

The field $\{v(n,m)\}$ is a deterministic random field. The field $\{w(n,m)\}$ is purely-indeterministic and has a unique white innovations driven moving average representation, given by

$$w(n,m) = \sum_{(0,0) \preceq (k,\ell)} b(k,\ell)\, u(n-k, m-\ell), \tag{2}$$

where $\sum_{(0,0) \preceq (k,\ell)} |b(k,\ell)|^2 < \infty$, $b(0,0) = 1$, and $\{u(n,m)\}$ is the innovations field of $\{y(n,m)\}$. The notation $\preceq$ implies that the summation is performed over all the samples that are in the "past" of the $(n,m)$ sample, where the past is defined with respect to any selected choice of NSHP total-ordering on the 2-D lattice. (See, for example, Fig. 1.)

Figure 1: RNSHP support; example with $\alpha = 2$ and $\beta = 1$.

It is possible to define, [6], a family of NSHP total-order definitions such that the boundary line of the NSHP has a rational slope. Let $\alpha$ and $\beta$ be two coprime integers, such that $\alpha \neq 0$. The angle $\theta$ of the slope is given by $\tan\theta = \beta/\alpha$. (See, for example, Fig. 1.) A NSHP of this type is called a rational non-symmetrical half-plane (RNSHP). For the case where $\alpha = 0$ the RNSHP is uniquely defined by setting $\beta = 1$. (For the case where $\beta = 0$ the RNSHP is uniquely defined by setting $\alpha = 1$.) We denote by $O$ the set of all possible RNSHP definitions on the 2-D lattice (i.e., the set of all NSHP definitions in which the boundary line of the NSHP has a rational slope). The introduction of the family of RNSHP total-ordering definitions results in the following countably infinite orthogonal decomposition of the deterministic component of the random field:

$$v(n,m) = p(n,m) + \sum_{(\alpha,\beta) \in O} e_{(\alpha,\beta)}(n,m). \tag{3}$$

The random field $\{p(n,m)\}$ is called half-plane deterministic. The field $\{e_{(\alpha,\beta)}(n,m)\}$ is the evanescent component that corresponds to the RNSHP total-ordering definition $(\alpha,\beta) \in O$.

Hence, if $\{y(n,m)\}$ is a 2-D regular and homogeneous random field, then $y(n,m)$ can be uniquely represented by the orthogonal decomposition

$$y(n,m) = w(n,m) + p(n,m) + \sum_{(\alpha,\beta) \in O} e_{(\alpha,\beta)}(n,m). \tag{4}$$

It is further shown in [6] that the spectral measures of the decomposition components in (4) are mutually singular. A model for the evanescent field which corresponds to the RNSHP defined by $(\alpha,\beta) \in O$ is given by

$$e_{(\alpha,\beta)}(n,m) = \sum_{i=1}^{I^{(\alpha,\beta)}} e_i^{(\alpha,\beta)}(n,m) = \sum_{i=1}^{I^{(\alpha,\beta)}} s_i^{(\alpha,\beta)}(n\alpha - m\beta) \exp\left(j2\pi \frac{\nu_i^{(\alpha,\beta)}}{\alpha^2+\beta^2}(n\beta + m\alpha)\right), \tag{5}$$

where the 1-D purely-indeterministic, complex valued processes $\{s_i^{(\alpha,\beta)}(n\alpha - m\beta)\}$ and $\{s_j^{(\alpha,\beta)}(n\alpha - m\beta)\}$ are zero-mean and mutually orthogonal for all $i \neq j$. Hence, the "spectral density function" of each evanescent field has the form of a countable sum of 1-D delta functions which are supported on lines of rational slope in the 2-D spectral domain.

One of the half-plane-deterministic field components, which is of prime importance in the STAP problem, is the harmonic random field

$$h(n,m) = \sum_{p=1}^{P} C_p \exp\left(j2\pi(n\omega_p + m\nu_p)\right), \tag{6}$$

where the $C_p$'s are mutually orthogonal random variables, and $(\omega_p, \nu_p)$ are the spatial frequencies of the $p$th harmonic.
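The evanescent model (5) can be made concrete with a short numerical sketch. The code below is an illustration only, not the author's implementation: the function name `evanescent_component`, the grid size, the choice $(\alpha,\beta) = (1,2)$, $\nu = 0.1$, and the white Gaussian modulating samples are all assumptions made for this example. It synthesizes a single term of (5) and checks the property the model implies: moving by $(\beta,\alpha)$ on the grid keeps $n\alpha - m\beta$ fixed and advances the phase by $2\pi\nu$.

```python
import numpy as np

def evanescent_component(S, T, alpha, beta, nu, samples, c_min):
    """Evaluate one term of (5) on an S x T grid.

    samples[c - c_min] holds the modulating-process sample s(c),
    where c = n*alpha - m*beta (hypothetical storage layout).
    """
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    c = n * alpha - m * beta                       # argument of the 1-D process
    phase = 2 * np.pi * nu / (alpha**2 + beta**2) * (n * beta + m * alpha)
    return samples[c - c_min] * np.exp(1j * phase)

rng = np.random.default_rng(0)
S, T, alpha, beta, nu = 8, 8, 1, 2, 0.1            # illustrative values only
c_min, c_max = -(T - 1) * beta, (S - 1) * alpha    # range of c when alpha, beta > 0
num = c_max - c_min + 1
samples = rng.standard_normal(num) + 1j * rng.standard_normal(num)
e = evanescent_component(S, T, alpha, beta, nu, samples, c_min)

# A step of (beta, alpha) stays on the same line c and rotates the phase by 2*pi*nu.
step = np.exp(2j * np.pi * nu)
same_line = np.allclose(e[beta:, alpha:], e[:-beta, :-alpha] * step)
print(same_line)
```

Along each such line the field is therefore a constant-amplitude 1-D harmonic signal of frequency $\nu$, which is the observation the estimation procedure of Section 5 relies on.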

3.  THE  STAP  MODEL  AND  THE  2-D  WOLD 
DECOMPOSITION 

The random field parametric model that results from the 2-D Wold-like orthogonal decomposition naturally arises as the physical model in the problem of space-time processing of airborne radar data. In the latter problem the target signal is modeled as a random amplitude complex exponential, where the exponential is defined by a space-time steering vector that has the target's angle and Doppler. In other words, in the space-time domain the target model is that of a 2-D harmonic component similar to (6). The purely-indeterministic component of the space-time field is the sum of a white noise field due to the internally generated receiver amplifier noise, and a colored noise field due to the sky noise contribution. The presence of a jammer results in a barrage of noise localized in angle and distributed over all Doppler frequencies. Thus, in the angle-Doppler domain each jammer contributes a 1-D delta function located at a specific angle, and therefore parallel to the Doppler axis. In the space-time domain each jammer is modeled as an evanescent component with $(\alpha,\beta) = (1,0)$ such that its 1-D modulating process is a white noise process. The ground clutter results in an additional evanescent component of the observed 2-D space-time field. The clutter echo from a single ground patch has a Doppler frequency that depends linearly on its aspect with respect to the platform. Hence, clutter from all angles lies in a "clutter ridge", supported on a diagonal line (that generally wraps around in Doppler) in the angle-Doppler domain. A model of the clutter field is then given by (5) with $(\alpha,\beta)$ such that $\tan\theta = \beta/\alpha$ corresponds to the slope of the clutter ridge. Since the rational numbers are dense in the set of real numbers, an irrational slope of the clutter ridge can be approximated arbitrarily closely by a rational one. Hence any clutter signal can be either exactly modeled, or approximated, by an evanescent field.

We therefore conclude that the foregoing derivation opens the way for new parametric solutions that can simplify and improve existing methods of STAP.

4.  ESTIMATION  OF  THE  COMPONENTS  PARAMETERS:
PROBLEM  DEFINITION

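We next state our assumptions and introduce some necessary notations. Let $\{y(n,m)\}$, $(n,m) \in D$, where $D = \{(i,j) \mid 0 \le i \le S-1,\ 0 \le j \le T-1\}$, be the observed random field.

Assumption 1: The purely-indeterministic component $\{w(n,m)\}$ is a zero mean circular complex valued random field.

Assumption 2: The number $I = \sum_{(\alpha,\beta) \in O} I^{(\alpha,\beta)}$ of evanescent components in the field is a priori known. This assumption can be later relaxed.

Assumption 3: For each evanescent field $\{e_i^{(\alpha,\beta)}\}$ the modulating 1-D purely-indeterministic process $\{s_i^{(\alpha,\beta)}\}$ is a zero-mean circular complex valued process.

Let $\mathbf{y} = [y(0,0), \ldots, y(0,T-1), \ldots, y(S-1,T-1)]^T$, and let $\mathbf{w}$, $\mathbf{e}_i^{(\alpha,\beta)}$ be similarly defined. Let

$$\boldsymbol{\xi}_i^{(\alpha,\beta)} = \left[ s_i^{(\alpha,\beta)}(0),\ s_i^{(\alpha,\beta)}(-\beta),\ \ldots,\ s_i^{(\alpha,\beta)}(-(T-1)\beta),\ s_i^{(\alpha,\beta)}(\alpha),\ s_i^{(\alpha,\beta)}(\alpha-\beta),\ \ldots,\ s_i^{(\alpha,\beta)}(\alpha-(T-1)\beta),\ \ldots,\ s_i^{(\alpha,\beta)}((S-1)\alpha),\ \ldots,\ s_i^{(\alpha,\beta)}((S-1)\alpha-(T-1)\beta) \right]^T \tag{7}$$

be the vector whose elements are the observed samples from the 1-D modulating process $\{s_i^{(\alpha,\beta)}\}$. Define

$$\mathbf{v}^{(\alpha,\beta)} = \left[ 0,\ \alpha,\ \ldots,\ (T-1)\alpha,\ \beta,\ \beta+\alpha,\ \ldots,\ \beta+(T-1)\alpha,\ \ldots,\ (S-1)\beta,\ (S-1)\beta+\alpha,\ \ldots,\ (S-1)\beta+(T-1)\alpha \right]^T. \tag{8}$$

Given a scalar function $f(v)$, we will denote the matrix, or column vector, consisting of the values of $f(v)$ evaluated for all the elements of $\mathbf{v}$, where $\mathbf{v}$ is a matrix or a column vector, by $f(\mathbf{v})$. Using this notation, we define

$$\mathbf{d}_i^{(\alpha,\beta)} = \exp\left(j2\pi \frac{\nu_i^{(\alpha,\beta)}}{\alpha^2+\beta^2}\, \mathbf{v}^{(\alpha,\beta)}\right). \tag{9}$$

Thus, using (5), we have that

$$\mathbf{e}_i^{(\alpha,\beta)} = \boldsymbol{\xi}_i^{(\alpha,\beta)} \odot \mathbf{d}_i^{(\alpha,\beta)}, \tag{10}$$

where $\odot$ denotes an element by element product of the vectors.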

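Note that whenever $n\alpha - m\beta = k\alpha - \ell\beta$ for some integers $n, m, k, \ell$ such that $0 \le n, k \le S-1$ and $0 \le m, \ell \le T-1$, the same element of $\{s_i^{(\alpha,\beta)}\}$ appears more than once in the vector $\boldsymbol{\xi}_i^{(\alpha,\beta)}$. It can be shown, [7], that for a rectangular observed field of dimensions $S \times T$ the number of distinct samples from the random process $\{s_i^{(\alpha,\beta)}\}$ that are found in the observed field is $N_c = (S-1)|\alpha| + (T-1)|\beta| + 1 - (|\alpha|-1)(|\beta|-1)$. This is because $N_c$ is the number of different "columns" one can define on such a rectangular lattice for a RNSHP defined by $(\alpha,\beta)$. We therefore define the concentrated version, $\mathbf{s}_i^{(\alpha,\beta)}$, of $\boldsymbol{\xi}_i^{(\alpha,\beta)}$ to be an $N_c$ dimensional column vector of non-repeating samples of the process $\{s_i^{(\alpha,\beta)}\}$. Thus for any $(\alpha,\beta)$ we have that

$$\boldsymbol{\xi}_i^{(\alpha,\beta)} = \mathbf{A}^{(\alpha,\beta)} \mathbf{s}_i^{(\alpha,\beta)}, \tag{11}$$

where $\mathbf{A}^{(\alpha,\beta)}$ is a rectangular matrix of zeros and ones which replicates rows of $\mathbf{s}_i^{(\alpha,\beta)}$.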
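The count $N_c$ and the replication matrix of (11) are easy to check numerically. The following sketch is an assumption-laden illustration (the helper names `replication_matrix` and `n_c_formula` are hypothetical): it enumerates the distinct values of $n\alpha - m\beta$ on an $S \times T$ grid, builds the zero-one matrix $\mathbf{A}^{(\alpha,\beta)}$ with the same row ordering as $\mathbf{y}$, and compares the number of columns with the closed-form expression for $N_c$ quoted above.

```python
import numpy as np

def replication_matrix(S, T, alpha, beta):
    """Zero-one matrix A of (11) and the sorted distinct values of n*alpha - m*beta."""
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    c = (n * alpha - m * beta).ravel()        # row order: y(0,0), y(0,1), ..., y(S-1,T-1)
    distinct, col = np.unique(c, return_inverse=True)
    A = np.zeros((S * T, distinct.size), dtype=int)
    A[np.arange(S * T), col] = 1              # exactly one 1 per row
    return A, distinct

def n_c_formula(S, T, alpha, beta):
    """Closed-form number of distinct samples quoted in the text."""
    a, b = abs(alpha), abs(beta)
    return (S - 1) * a + (T - 1) * b + 1 - (a - 1) * (b - 1)

S, T = 6, 5                                    # illustrative grid size
for alpha, beta in [(1, 0), (1, 1), (1, -2), (2, 3), (3, 2)]:
    A, distinct = replication_matrix(S, T, alpha, beta)
    print((alpha, beta), A.shape[1], n_c_formula(S, T, alpha, beta))
```

For pairs with $|\alpha| > 1$ and $|\beta| > 1$ the distinct values are not consecutive integers, which is the boundary effect discussed next.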

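Note however that due to boundary effects, the vector $\mathbf{s}_i^{(\alpha,\beta)}$ is not composed of consecutive samples from the process $\{s_i^{(\alpha,\beta)}\}$ unless $|\alpha| \le 1$ or $|\beta| \le 1$. In other words, for some arbitrary $\alpha$ and $\beta$ there are missing samples in $\mathbf{s}_i^{(\alpha,\beta)}$. We note that the covariance matrix $\mathbf{R}_i^{(\alpha,\beta)}$ which characterizes the process $\{s_i^{(\alpha,\beta)}\}$ is defined in terms of the concentrated version vector $\mathbf{s}_i^{(\alpha,\beta)}$, i.e., $\mathbf{R}_i^{(\alpha,\beta)} = E[\mathbf{s}_i^{(\alpha,\beta)} (\mathbf{s}_i^{(\alpha,\beta)})^H]$, and not in terms of the covariance matrix of the vector $\boldsymbol{\xi}_i^{(\alpha,\beta)}$, $\tilde{\mathbf{R}}_i^{(\alpha,\beta)} = E[\boldsymbol{\xi}_i^{(\alpha,\beta)} (\boldsymbol{\xi}_i^{(\alpha,\beta)})^H]$. The matrix $\tilde{\mathbf{R}}_i^{(\alpha,\beta)}$ is a singular matrix, given by $\tilde{\mathbf{R}}_i^{(\alpha,\beta)} = \mathbf{A}^{(\alpha,\beta)} \mathbf{R}_i^{(\alpha,\beta)} (\mathbf{A}^{(\alpha,\beta)})^T$.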

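Since the evanescent components $\{e_i^{(\alpha,\beta)}\}$ are mutually orthogonal, and since all the evanescent components are orthogonal to the purely-indeterministic component, we conclude that $\boldsymbol{\Gamma}$, the covariance matrix of $\mathbf{y}$, has the form

$$\boldsymbol{\Gamma} = \boldsymbol{\Gamma}_{PI} + \sum_{(\alpha,\beta) \in O} \sum_{i=1}^{I^{(\alpha,\beta)}} \boldsymbol{\Gamma}_i^{(\alpha,\beta)}, \tag{12}$$

where $\boldsymbol{\Gamma}_i^{(\alpha,\beta)}$ is the covariance matrix of $\mathbf{e}_i^{(\alpha,\beta)}$.

Using (10) and (5) we find that

$$\boldsymbol{\Gamma}_i^{(\alpha,\beta)} = \tilde{\mathbf{R}}_i^{(\alpha,\beta)} \odot \left(\mathbf{d}_i^{(\alpha,\beta)} (\mathbf{d}_i^{(\alpha,\beta)})^H\right). \tag{13}$$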
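As a hedged illustration of (13), the sketch below assembles the covariance matrix of one evanescent component from a given $N_c \times N_c$ matrix $\mathbf{R}$, using the vectors of (8) and (9) and the replication matrix of (11). The function name `evanescent_covariance`, the grid size, and the choice of a first-order AR covariance for $\mathbf{R}$ are assumptions made for this example only (for $|\alpha| \le 1$ the concentrated samples are consecutive, so a Toeplitz AR(1) covariance is a reasonable stand-in).

```python
import numpy as np

def evanescent_covariance(S, T, alpha, beta, nu, R):
    """Covariance of one evanescent component following (13)."""
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    c = (n * alpha - m * beta).ravel()
    v = (n * beta + m * alpha).ravel()                      # vector v of (8)
    _, col = np.unique(c, return_inverse=True)
    A = np.eye(R.shape[0])[col]                             # zero-one matrix A of (11)
    d = np.exp(2j * np.pi * nu / (alpha**2 + beta**2) * v)  # vector d of (9)
    return (A @ R @ A.T) * np.outer(d, d.conj())            # Hadamard product in (13)

S = T = 6                                                   # illustrative values only
alpha, beta, nu = 1, -2, 0.1
N_c = (S - 1) * abs(alpha) + (T - 1) * abs(beta) + 1 - (abs(alpha) - 1) * (abs(beta) - 1)

# Assumed AR(1) covariance: s[k] = rho * s[k-1] + u[k], driving-noise variance 2.
rho, drive_var = 0.5, 2.0
k = np.arange(N_c)
R = drive_var / (1 - rho**2) * rho ** np.abs(k[:, None] - k[None, :])

Gamma = evanescent_covariance(S, T, alpha, beta, nu, R)
print(Gamma.shape, np.linalg.matrix_rank(Gamma))
```

The resulting matrix is Hermitian and positive semidefinite, and its rank equals $N_c$ rather than $ST$, which is the low-rank structure exploited later in the paper.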

5.  PARAMETRIC  ESTIMATION  OF  THE 
INTERFERENCE  COMPONENTS 

In this section we derive a computationally efficient algorithm for estimating both the jamming and clutter fields, based on the above results. The proposed estimation algorithm of the spectral support parameters of the evanescent field, $\alpha$, $\beta$ and $\nu_i^{(\alpha,\beta)}$, is based on the observation (see the evanescent field model (5)) that for a fixed $c = n\alpha - m\beta$ (i.e., along a line on the sampling grid), the samples of the evanescent component are the samples of a 1-D constant amplitude harmonic signal, whose frequency is $\nu_i^{(\alpha,\beta)}$. The algorithm is implemented by the following three-step procedure:

In the presence of an evanescent component, the peaks of the observed field periodogram are concentrated along a straight line, such that its slope is defined by the two coprime integers $\alpha$ and $\beta$. Hence, several alternative approaches for obtaining an initial estimate of the spectral support parameters of the evanescent component can be derived by taking the Radon or Hough transforms, [12], of the observed field periodogram. (The current implementation employs the Hough transform for detecting straight lines in 2-D arrays.) However, due to the presence of noise, this estimate may be perturbed. Since on a finite dimension observed field only a finite number of possible $(\alpha,\beta)$ pairs may be defined, the output of the initial stage is a set of possible $(\alpha,\beta)$ pairs such that the ratio $\beta/\alpha$ is close to the ratio obtained for the $(\alpha,\beta)$ pair estimated by the Hough transform.

For each possible $(\alpha,\beta)$ pair we next evaluate the frequency parameter of the evanescent component, $\nu_i^{(\alpha,\beta)}$. Assuming the considered $(\alpha,\beta)$ pair is the correct one, we know that in the absence of background noise, for a fixed $c = n\alpha - m\beta$ (i.e., along a line on the sampling grid), the samples of the evanescent component are the samples of a 1-D constant amplitude harmonic signal, whose frequency is $\nu_i^{(\alpha,\beta)}$. Hence, by considering the samples along such a line we obtain samples of a 1-D constant amplitude harmonic signal whose frequency $\nu_i^{(\alpha,\beta)}$ can be easily estimated using any standard frequency estimation algorithm (e.g., the 1-D DFT).

The test for detecting the correct $(\alpha,\beta)$ and $\nu_i^{(\alpha,\beta)}$ is then based on multiplying the observed signal $y(n,m)$ by $\exp\left(-j2\pi \frac{\nu_i^{(\alpha,\beta)}}{\alpha^2+\beta^2}(n\beta + m\alpha)\right)$, for each of the considered $\alpha$, $\beta$ and $\nu_i^{(\alpha,\beta)}$ triplets, and evaluating the variance of this signal along a line on the sampling grid such that $c = n\alpha - m\beta$. Clearly, the best estimate of $\alpha$, $\beta$ and $\nu_i^{(\alpha,\beta)}$ is the one that results in minimal variance for the 1-D sequence, since in the absence of noise the correct $\alpha$, $\beta$ and $\nu_i^{(\alpha,\beta)}$ result in a zero variance.
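The minimum-variance test just described can be sketched as follows. This is an illustrative reading of the procedure, not the author's code: the function `line_variance`, the candidate list, the frequency grid, the noise level, and the way the per-line variances are pooled (summed squared deviations over all lines with more than one sample) are assumptions. The Hough-transform stage is not reproduced; a small hand-picked candidate set stands in for its output.

```python
import numpy as np

def line_variance(y, alpha, beta, nu):
    """Pooled within-line variance of the demodulated field for one candidate triplet."""
    S, T = y.shape
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    phase = 2 * np.pi * nu / (alpha**2 + beta**2) * (n * beta + m * alpha)
    demod = (y * np.exp(-1j * phase)).ravel()
    c = (n * alpha - m * beta).ravel()
    total, count = 0.0, 0
    for value in np.unique(c):
        line = demod[c == value]
        if line.size > 1:                       # single-sample lines carry no information
            total += np.sum(np.abs(line - line.mean()) ** 2)
            count += line.size
    return total / count

# Synthetic field: one evanescent component with (alpha, beta, nu) = (1, 1, 0.2) plus weak noise.
rng = np.random.default_rng(1)
S = T = 16
n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
s = rng.standard_normal(S + T - 1) + 1j * rng.standard_normal(S + T - 1)
y = s[n - m + T - 1] * np.exp(2j * np.pi * 0.2 / 2 * (n + m))
y = y + 0.05 * (rng.standard_normal((S, T)) + 1j * rng.standard_normal((S, T)))

candidates = [(a, b, nu)
              for (a, b) in [(1, 0), (1, 1), (1, 2), (2, 1), (1, -1)]
              for nu in np.arange(10) / 10]
best = min(candidates, key=lambda p: line_variance(y, *p))
print(best)
```

For the correct triplet the pooled variance drops to the noise level, while every other candidate leaves the random modulating samples or a residual oscillation inside each line.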

Having estimated the spectral support parameters of each evanescent component, we take the approach of first estimating a non-parametric representation of its 1-D purely-indeterministic modulating process $\{s_i^{(\alpha,\beta)}\}$, and only at a second stage do we estimate the parametric models of these processes. Hence, in the first stage we estimate the particular values which the vectors $\boldsymbol{\xi}_i^{(\alpha,\beta)}$ take for the given realization, i.e., we treat these as unknown constants. The estimation procedure is implemented as follows: multiplying the observed signal $y(n,m)$ by $\exp\left(-j2\pi \frac{\nu_i^{(\alpha,\beta)}}{\alpha^2+\beta^2}(n\beta + m\alpha)\right)$ and evaluating the arithmetic mean of this signal along a line on the sampling grid such that $c = n\alpha - m\beta$, we have

$$\hat{s}_i^{(\alpha,\beta)}(c) = \frac{1}{N_s} \sum_{n\alpha - m\beta = c} y(n,m) \exp\left(-j2\pi \frac{\nu_i^{(\alpha,\beta)}}{\alpha^2+\beta^2}(n\beta + m\alpha)\right), \tag{14}$$

where $N_s$ denotes the number of the observed field samples that satisfy the relation $n\alpha - m\beta = c$. Once we have obtained the sequence of estimated samples from the 1-D modulating process $\{s_i^{(\alpha,\beta)}\}$, the problem of estimating its parametric model becomes entirely a 1-D estimation problem. Assuming the modulating process is an AR process, and applying an AR estimation algorithm to the sequence (see, e.g., [13]), we obtain estimates of the modulating process parameters as well.
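A minimal sketch of the two-stage procedure around (14) is given below. It is an illustration under stated assumptions: the helper names, the $64 \times 64$ grid, the parameters $(\alpha,\beta,\nu) = (1,1,0.2)$, the noise level, and the AR(1) recursion $s[k] = \rho\, s[k-1] + u[k]$ with $\rho = 0.5$ are all chosen for the example, and a plain least-squares fit stands in for whichever AR estimator of [13] is actually used.

```python
import numpy as np

def estimate_modulating_samples(y, alpha, beta, nu):
    """Demodulate and average along lines n*alpha - m*beta = c, as in (14)."""
    S, T = y.shape
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    phase = 2 * np.pi * nu / (alpha**2 + beta**2) * (n * beta + m * alpha)
    demod = (y * np.exp(-1j * phase)).ravel()
    c = (n * alpha - m * beta).ravel()
    c_values, col, counts = np.unique(c, return_inverse=True, return_counts=True)
    sums = np.zeros(c_values.size, dtype=complex)
    np.add.at(sums, col, demod)                 # accumulate the samples of each line
    return c_values, sums / counts              # counts plays the role of N_s

def ar1_coefficient(s):
    """Least-squares fit of s[k] ~ rho * s[k-1] (stand-in AR estimator)."""
    return np.vdot(s[:-1], s[1:]) / np.vdot(s[:-1], s[:-1])

rng = np.random.default_rng(2)
S = T = 64
rho = 0.5
u = rng.standard_normal(S + T - 1) + 1j * rng.standard_normal(S + T - 1)
s_true = np.empty_like(u)
s_true[0] = u[0]
for k in range(1, u.size):
    s_true[k] = rho * s_true[k - 1] + u[k]

n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
y = s_true[n - m + T - 1] * np.exp(2j * np.pi * 0.2 / 2 * (n + m))
y = y + 0.05 * (rng.standard_normal((S, T)) + 1j * rng.standard_normal((S, T)))

c_values, s_hat = estimate_modulating_samples(y, 1, 1, 0.2)
rho_hat = ar1_coefficient(s_hat)
print(np.max(np.abs(s_hat - s_true)), rho_hat)
```

Samples near the ends of the range of $c$ are averaged over very few grid points, so their estimates are noisier than those from the long central lines.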


Figure  2:  Spectral  density  of  the  observed  field. 


Finally, it is important to note that we solve the difficult problem of evaluating the rank of the low-rank covariance matrix of the interference as a by-product of obtaining the parametric estimates of the interference components. Denote the number of evanescent components (interference sources) of the field by $Q$. It is then shown in [11] that the rank of the interference covariance matrix is given by

$$\mathrm{rank}(\boldsymbol{\Gamma}) = S \sum_{k=1}^{Q} |\alpha_k| + T \sum_{k=1}^{Q} |\beta_k| - \sum_{k=1}^{Q} |\alpha_k| \sum_{k=1}^{Q} |\beta_k|.$$

In fact, the special case where $Q = 1$ and $\alpha = 1$ is the well known Brennan rule, [3], for the rank of the clutter covariance matrix.
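The rank expression can be checked numerically on small grids. In the sketch below (helper names and parameter values are assumptions for illustration), the range of each $\boldsymbol{\Gamma}_i^{(\alpha,\beta)}$ in (13) is spanned by the columns of $\mathrm{diag}(\mathbf{d})\,\mathbf{A}$ whenever the modulating-process covariance is nonsingular, so the rank of the total interference covariance equals the rank of the stacked bases.

```python
import numpy as np

def interference_rank(S, T, components):
    """Numerical rank of the interference subspace for a list of (alpha, beta, nu)."""
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    blocks = []
    for alpha, beta, nu in components:
        c = (n * alpha - m * beta).ravel()
        v = (n * beta + m * alpha).ravel()
        _, col = np.unique(c, return_inverse=True)
        A = np.eye(col.max() + 1)[col]                          # matrix A of (11)
        d = np.exp(2j * np.pi * nu / (alpha**2 + beta**2) * v)  # vector d of (9)
        blocks.append(d[:, None] * A)                           # spans the range of (13)
    return np.linalg.matrix_rank(np.hstack(blocks))

def rank_formula(S, T, components):
    """Rank expression quoted from [11]."""
    a = sum(abs(alpha) for alpha, _, _ in components)
    b = sum(abs(beta) for _, beta, _ in components)
    return S * a + T * b - a * b

clutter = [(1, 2, 0.1)]                        # Q = 1, alpha = 1: Brennan-rule case
jam_plus_clutter = [(1, 0, 0.1), (1, 1, 0.3)]  # Q = 2
print(interference_rank(8, 8, clutter), rank_formula(8, 8, clutter))
print(interference_rank(4, 4, jam_plus_clutter), rank_formula(4, 4, jam_plus_clutter))
```

With $Q = 1$ and $\alpha = 1$ the formula reduces to $S + (T-1)|\beta|$, the form in which the Brennan rule is usually stated.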


6.  PARAMETRIC  FULLY  ADAPTIVE 
PROCESSING 

Having estimated the parametric models of the purely indeterministic and evanescent components of the field, the estimated parameters can be substituted into (12)-(13) to obtain an estimate of the interference-plus-noise covariance matrix $\boldsymbol{\Gamma}$.

Let $\mathbf{v}_t$ denote the target steering vector, given by

$$\mathbf{v}_t = \mathbf{b}(\varpi_t) \otimes \mathbf{a}(\vartheta_t). \tag{15}$$

Assuming a linear, uniformly spaced sensor array and a uniform CPI are employed in our model, the spatial steering vector $\mathbf{a}(\vartheta)$ and the temporal steering vector $\mathbf{b}(\varpi)$ are given by

$$\mathbf{a}(\vartheta) = [1, e^{j2\pi\vartheta}, \ldots, e^{j2\pi(S-1)\vartheta}]^T,$$

$$\mathbf{b}(\varpi) = [1, e^{j2\pi\varpi}, \ldots, e^{j2\pi(T-1)\varpi}]^T,$$

respectively. It is well known (e.g., [3], p. 57) that the optimum space-time filter is given to within a scale factor by

$$\mathbf{w} = \boldsymbol{\Gamma}^{-1} \mathbf{v}_t. \tag{16}$$

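The test statistic $z(\varpi,\vartheta)$ is then given by

$$z(\varpi,\vartheta) = \mathbf{w}^H(\varpi,\vartheta)\, \mathbf{y} = \mathbf{v}_t^H(\varpi,\vartheta)\, (\boldsymbol{\Gamma}^{-1})^H \mathbf{y}. \tag{17}$$

Let $\boldsymbol{\chi}_f = (\boldsymbol{\Gamma}^{-1})^H \mathbf{y}$. We thus have

$$z(\varpi,\vartheta) = \mathbf{v}_t^H(\varpi,\vartheta)\, \boldsymbol{\chi}_f = \left(\mathbf{b}^H(\varpi) \otimes \mathbf{a}^H(\vartheta)\right) \boldsymbol{\chi}_f. \tag{18}$$

Figure 3: The test statistic $z(\varpi,\vartheta)$.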

Reorganizing the elements of $\boldsymbol{\chi}_f$ into a $T \times S$ matrix $\boldsymbol{\Psi}$, where the elements of the $k$th row of $\boldsymbol{\Psi}$ are $\chi_f((k-1)S+1), \ldots, \chi_f(kS)$, we conclude that for a linear, uniformly spaced sensor array and uniform CPI the test statistic is given by

$$z(\varpi,\vartheta) = \sum_{p=1}^{T} \sum_{q=1}^{S} e^{-j2\pi(p-1)\varpi}\, e^{-j2\pi(q-1)\vartheta}\, \Psi(p,q). \tag{19}$$

Thus, $z(\varpi,\vartheta)$ and $\Psi(p,q)$ are a 2-D DFT pair, and the test is equivalent to finding the 2-D frequency where the 2-D DFT of $\Psi(p,q)$ is maximal.

To illustrate the operation of the proposed solution we resort to numerical evaluation of some specific examples. Consider a 2-D observed random field consisting of a sum of a purely-indeterministic component (background noise), a single evanescent (interference) component, and three harmonic components (targets). The purely-indeterministic component is a complex valued circular Gaussian white noise field. The evanescent component spectral support parameters are $(\alpha,\beta) = (1,-2)$, $\nu^{(1,-2)} = 0$. The modulating 1-D purely indeterministic process of this evanescent component is a first order Gaussian AR process, such that its driving noise variance is $(\sigma^{(1,-2)})^2 = 2$, and $a^{(1,-2)}(1) = -0.5$. There are three targets, which are located at $(0.05, 0)$, $(0.15, 0.15)$ and $(-0.25, 0.15)$, respectively. The observed field dimensions are $48 \times 48$.

Let us define the experimental variance of each of the field components as $E_w = \mathbf{w}^H \mathbf{w}$ for the purely indeterministic component; $E_e = (\mathbf{e}^{(\alpha,\beta)})^H \mathbf{e}^{(\alpha,\beta)}$ for the evanescent component; and $E_{h_k} = \mathbf{h}_k^H \mathbf{h}_k$, $k = 1, 2, 3$, for each of the harmonic components, where $\mathbf{h}_k$ is defined in the same way $\mathbf{w}$ and $\mathbf{e}^{(\alpha,\beta)}$ are defined. In this example the energy ratio for the evanescent component is 6 dB, while for the three targets the ratios are $-12.8$ dB, $-14.5$ dB, and $-15$ dB. Due to the strong interference component, the presence of the three targets is hard to detect in the observed data, whose power spectral density is depicted in Fig. 2. However, these targets are easily detected by the test statistic $z(\varpi,\vartheta)$ depicted in Fig. 3. In Fig. 3, $z(\varpi,\vartheta)$ is depicted as a function of the two-dimensional frequencies, i.e., angle and Doppler.

7.  PARAMETRIC  PARTIALLY  ADAPTIVE 
PROCESSING 

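Recall that

$$\boldsymbol{\Gamma}_i^{(\alpha,\beta)} = \left(\mathbf{A}^{(\alpha,\beta)} \mathbf{R}_i^{(\alpha,\beta)} (\mathbf{A}^{(\alpha,\beta)})^T\right) \odot \left(\mathbf{d}_i^{(\alpha,\beta)} (\mathbf{d}_i^{(\alpha,\beta)})^H\right). \tag{20}$$

Having estimated $\alpha$, $\beta$ and $\nu_i^{(\alpha,\beta)}$ using the algorithm in Section 5, the vector $\mathbf{d}_i^{(\alpha,\beta)}$ is known. Hence, demodulating $\mathbf{e}_i^{(\alpha,\beta)}$, we conclude using (10) that the demodulated vector, which we denote by $\tilde{\mathbf{e}}_i^{(\alpha,\beta)}$, is given by

$$\tilde{\mathbf{e}}_i^{(\alpha,\beta)} = \mathbf{e}_i^{(\alpha,\beta)} \odot \left((\mathbf{d}_i^{(\alpha,\beta)})^H\right)^T. \tag{21}$$

From (11) we conclude that the covariance matrix of $\tilde{\mathbf{e}}_i^{(\alpha,\beta)}$ is given by

$$\tilde{\boldsymbol{\Gamma}}_i^{(\alpha,\beta)} = \mathbf{A}^{(\alpha,\beta)} \mathbf{R}_i^{(\alpha,\beta)} (\mathbf{A}^{(\alpha,\beta)})^T. \tag{22}$$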

In the following it is proved that since $\alpha$ and $\beta$ are already known, an orthogonal projection matrix onto the low-rank subspace spanned by the evanescent field covariance matrix can be found without estimating the parametric model of the evanescent field 1-D modulating process, and hence without estimating $\mathbf{R}_i^{(\alpha,\beta)}$. Moreover, this result removes the need both to evaluate the field covariance matrix and to apply a computationally intensive eigenanalysis to the estimated covariance matrix.

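More specifically, let us construct the following orthogonal projection matrix

$$\mathbf{T}^{(\alpha,\beta)} = \mathbf{A}^{(\alpha,\beta)} \left((\mathbf{A}^{(\alpha,\beta)})^T \mathbf{A}^{(\alpha,\beta)}\right)^{-1} (\mathbf{A}^{(\alpha,\beta)})^T. \tag{23}$$

It is easily verified (by substitution) that $\mathbf{T}^{(\alpha,\beta)}$ is an orthogonal projection onto the range space of $\tilde{\boldsymbol{\Gamma}}_i^{(\alpha,\beta)}$, since for any $ST$ dimensional vector $\mathbf{v}$

$$\tilde{\boldsymbol{\Gamma}}_i^{(\alpha,\beta)} \mathbf{v} = \mathbf{T}^{(\alpha,\beta)} \tilde{\boldsymbol{\Gamma}}_i^{(\alpha,\beta)} \mathbf{v}. \tag{24}$$

Also, $(\mathbf{T}^{(\alpha,\beta)})^2 = \mathbf{T}^{(\alpha,\beta)}$, and $(\mathbf{T}^{(\alpha,\beta)})^T = \mathbf{T}^{(\alpha,\beta)}$.

Note that since $\mathbf{A}^{(\alpha,\beta)}$ is a sparse matrix of zeros and ones only, the computation of $\mathbf{T}^{(\alpha,\beta)}$ is very simple.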

The projection matrix onto the subspace orthogonal to the interference space is therefore given by $(\mathbf{T}^{(\alpha,\beta)})^{\perp} = \mathbf{I} - \mathbf{T}^{(\alpha,\beta)}$. Hence, by projecting the demodulated observed data vector $\tilde{\mathbf{y}} = \mathbf{y} \odot \left((\mathbf{d}_i^{(\alpha,\beta)})^H\right)^T$ onto the subspace orthogonal to the interference subspace, a reduced dimension data vector given by $\breve{\mathbf{y}} = (\mathbf{T}^{(\alpha,\beta)})^{\perp} \tilde{\mathbf{y}}$ is obtained, such that the interference contribution to the observed signal is mitigated. Remodulating $\breve{\mathbf{y}}$ by evaluating $\breve{\mathbf{y}} \odot \mathbf{d}_i^{(\alpha,\beta)}$, followed by sequentially applying this procedure to mitigate each of the interference sources, the detection problem is reduced to that of detecting a target in the presence of background noise only. Thus, in the special case where the background noise is known to be a white noise field, the statistical test is equivalent to finding the 2-D frequency where the 2-D DFT of the processed data vector (organized back into a 2-D array) is maximal.

Figure 4: Spectral density of the field after being projected onto the subspace orthogonal to the interference subspace.
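The demodulate, project, and remodulate steps can be sketched as below. This is a hedged illustration: the function `mitigate`, the grid size, the spectral support parameters, and the weak harmonic "target" are assumptions for the example, and the dense matrix inverse is used only for clarity (since $\mathbf{A}^T\mathbf{A}$ is diagonal, the projection amounts to subtracting the mean along each line $n\alpha - m\beta = c$).

```python
import numpy as np

def mitigate(y, alpha, beta, nu):
    """Remove one evanescent component with known (alpha, beta, nu) via (23)."""
    S, T = y.shape
    n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
    c = (n * alpha - m * beta).ravel()
    v = (n * beta + m * alpha).ravel()
    _, col = np.unique(c, return_inverse=True)
    A = np.eye(col.max() + 1)[col]                          # matrix A of (11)
    d = np.exp(2j * np.pi * nu / (alpha**2 + beta**2) * v)  # vector d of (9)
    P = A @ np.linalg.inv(A.T @ A) @ A.T                    # projection T of (23)
    y_demod = y.ravel() * d.conj()                          # demodulate
    y_proj = (np.eye(S * T) - P) @ y_demod                  # project onto the orthogonal subspace
    return (y_proj * d).reshape(S, T), P                    # remodulate

rng = np.random.default_rng(3)
S = T = 16
alpha, beta, nu = 1, 2, 0.1                                 # illustrative values only
n, m = np.meshgrid(np.arange(S), np.arange(T), indexing="ij")
c = n * alpha - m * beta
num = c.max() - c.min() + 1
s = rng.standard_normal(num) + 1j * rng.standard_normal(num)
e = s[c - c.min()] * np.exp(2j * np.pi * nu / (alpha**2 + beta**2) * (n * beta + m * alpha))
h = 0.1 * np.exp(2j * np.pi * (0.15 * n + 0.05 * m))        # weak harmonic component

cleaned, P = mitigate(e + h, alpha, beta, nu)
print(np.linalg.norm(e + h), np.linalg.norm(cleaned))
```

In this noise-free sketch the evanescent component is removed exactly, and what remains is the part of the harmonic component that lies outside the interference subspace.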

As an example, consider the same field as in the previous section. Due to the strong interference component, the presence of the three targets is hard to detect in the observed data, whose power spectral density is depicted in Fig. 2. However, these targets are easily detected in the processed data, as illustrated in Fig. 4. This result is obtained without estimating the parametric model of the evanescent field 1-D modulating process, and hence without estimating the interference-plus-noise covariance matrix. Since both the estimation of the interference-plus-noise covariance matrix and its analysis are avoided, the proposed parametric partially adaptive processing method is robust and computationally attractive.

ABSTRACT

Hyperspectral data consists of hundreds of contiguous radiometric measurements collected passively from each pixel in a scene. Detection capitalizes on exploiting the difference between target and background spectral signatures. Many detection methods in hyperspectral processing employ signal models commonly used in radar, even though radar is an active sensor. Starting from a common signal model, we discuss adaptive detection algorithms for hyperspectral data by outlining fundamental similarities and differences with radar. We demonstrate detection using hyperspectral data through experiments with real data and discuss the fundamental applicability of adaptive radar signal models to detection in hyperspectral processing.

1.  INTRODUCTION 

The potential of hyperspectral sensors to perform target detection has begun to emerge as data from current and projected sensors has shown that passive, spectral measurements can distinguish targets from background. The basis for detection resides in exploiting the differences in reflective properties that occur in the hundreds of contiguous spectral bands that comprise hyperspectral signals. Collectively, these measurements constitute a vector signal that may be used in detection algorithms designed to maximize the separation between target and background signals.

For detection algorithms to be successful in operational scenarios, they must employ accurate statistical descriptions of both the target and background. Many of the algorithms currently in use have been adapted from signal models used for detection in radar systems. Consequently, despite the significant differences in the physical mechanisms, a strong parallelism can be drawn that maps the measured signals from each sensor to a common signal model.

This work was sponsored by the Department of Defense under Contract F19628-00-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the author and are not necessarily endorsed by the United States Air Force.


2.  MODELS  FOR  HYPERSPECTRAL  SENSING 
AND  MTI  RADAR 

In  order  to  understand  the  relationship  between  the  signal 
models  for  hyperspectral  sensing  and  MTI  radar,  we  first 
explain  the  basic  concepts  behind  both  sensor  models. 

2.1.  Hyperspectral  Imaging 

Hyperspectral sensors passively collect measurements of radiation in hundreds of contiguous spectral bands. Collectively, hyperspectral imaging (HSI) provides continuous coverage of the electromagnetic spectrum over a wide range of wavelengths. Incident radiation from the sun follows several pathways as it reaches the sensor, where it is measured in terms of radiance (Watts/steradian/cm$^2$/$\mu$m). Mathematically, the radiance arriving at the sensor, $L_{sensor}(\lambda)$, can be described as

$$L_{sensor}(\lambda) = L_{solar}(\lambda)\,\rho(\lambda)\,\tau(\lambda) + L_{path}(\lambda) \tag{1}$$

where $L_{solar}(\lambda)$ is the radiance spectrum entering the atmosphere at a designated time and location as a function of wavelength, $\tau(\lambda)$ is the atmospheric transmittance, $\rho(\lambda)$ is the surface reflectance, and $L_{path}(\lambda)$ is the additive path radiance arising from interactions with the atmosphere.

In some cases, processing of the radiance arriving at the sensor can yield useful results. However, in most cases, the surface reflectance, $\rho(\lambda)$, is the quantity that is desired because it is an intrinsic property of the area being imaged and is invariant to differences in atmospheric conditions during observation. Reflectance is defined as the ratio of the intensity reflected from the surface of an object to the intensity arriving at it ($0 \le \rho(\lambda) \le 1$), and the recovery of $\rho(\lambda)$ from $L(\lambda)$ is accomplished through atmospheric compensation. In this procedure, the surface reflectance for each pixel is recovered by removing the effects of gaseous and water vapor absorption in the atmosphere. Atmospheric compensation is derived from radiative transfer models and is by no means an exact science. In addition to being computationally demanding, the amount of error in the compensation is difficult to quantify. Nevertheless, most hyperspectral processing is performed "in reflectance."

Fig. 1. 3-D datacubes for HSI.

Fig. 2. 3-D CPI datacube for MTI radar.
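To make the role of each term in the radiance model (1) concrete, the following sketch evaluates the model and inverts it algebraically for the reflectance. It is only an idealized illustration: the function names and all numerical values are made up for the example, and real atmospheric compensation must estimate the transmittance and path radiance from radiative transfer models rather than assume them known.

```python
import numpy as np

def sensor_radiance(L_solar, rho, tau, L_path):
    """Forward model (1): radiance arriving at the sensor, per spectral band."""
    return L_solar * rho * tau + L_path

def reflectance(L_sensor, L_solar, tau, L_path):
    """Algebraic inversion of (1), assuming tau and L_path are known exactly."""
    return (L_sensor - L_path) / (L_solar * tau)

# Arbitrary illustrative values for four spectral bands (not measured data).
L_solar = np.array([1.8, 1.5, 1.1, 0.7])
tau = np.array([0.85, 0.90, 0.60, 0.92])
L_path = np.array([0.20, 0.12, 0.05, 0.02])
rho_true = np.array([0.10, 0.35, 0.40, 0.25])

L_sensor = sensor_radiance(L_solar, rho_true, tau, L_path)
rho_rec = reflectance(L_sensor, L_solar, tau, L_path)
print(rho_rec)
```

Any error in the assumed transmittance or path radiance propagates directly into the recovered reflectance, which is one way to see why the compensation error is hard to quantify.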

Hyperspectral sensors collect data along two spatial axes derived from the motion of the sensor (along-track and across-track) and another spectral axis. The resulting three-dimensional cube is depicted in Figure 1. The spatial resolution in HSI is a consequence of several factors, but generally can be determined from only two: instantaneous field of view (IFOV) and altitude. IFOV is a parameter describing the optics that conveys the angular expanse of one element on the focal plane array that measures radiance. Multiplying the IFOV by the altitude of the sensor gives the pixel size of the scene.

2.1.1.  Linear  Mixing  Model 

Hyperspectral processing attempts to exploit the wavelength-dependent features of the reflectance spectrum measured from a pixel. However, it is quite common for the surface area occupying a pixel to be a combination of distinct materials, or endmembers (e.g., water, trees, vehicle), each possessing its own reflectance function. The reflectance function of a mixed pixel is some combination of the distinct reflectance functions of each endmember. In general, accurate physical modelling of the reflective properties of mixtures is not trivial, and is a function of numerous molecular parameters, as well as the proportions in which the endmembers appear. Several physically-derived models have been proposed to model mixing under different conditions.

A  common  assumption  for  describing  the  mixing  pro¬ 
cess  throughout  hyperspectral  processing  that  is  analytically 
tractable  is  that  the  reflectance  spectrum  of  a  mixed  pixel  is 
a  weighted  linear  combination  of  the  individual  endmember 
reflectance  functions,  where  the  weights  are  the  proportions 
in  which  each  endmember  appears.  Thus,  the  mathematical 
model  describing  this  recipe  for  a  mixed  pixel  is 

$$x = Sa + n = \sum_{i=1}^{P} a_i s_i + n \qquad (2)$$

Here, x is the reflectance spectrum of a mixed pixel, S is a matrix
whose P columns $s_i$ are the reflectance spectra of the endmembers,
and a is a P × 1 vector of non-negative fractional abundances. The
additive noise vector, n, represents the inaccuracies in the model.
Two important constraints on a must be imposed. The non-negativity
constraint demands that $a_i \geq 0$, $i = 1, \ldots, P$, and, to
ensure the composition of a mixed pixel is completely accounted for,
the additivity constraint requires $\sum_{i=1}^{P} a_i = 1$.
Collectively, these constraints and the synthesis equation for mixed
pixels in (2) are referred to as the Linear Mixing Model (LMM).
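
A minimal sketch of the synthesis step of the LMM may help fix ideas: it forms a weighted sum of endmember spectra and rejects abundance vectors that violate the non-negativity or additivity constraints. The function name, the four-band spectra, and the Gaussian noise model are illustrative assumptions, not data or code from this work.

```python
import random


def mix_pixel(endmembers, abundances, noise_std=0.0, rng=None):
    """Synthesize x = S a + n for one pixel.

    endmembers: list of P spectra (the columns of S), each a list of band values.
    abundances: list of P fractional abundances a_i.
    noise_std:  standard deviation of an assumed zero-mean Gaussian noise term.
    """
    if len(endmembers) != len(abundances):
        raise ValueError("need exactly one abundance per endmember")
    if any(a < 0 for a in abundances):
        raise ValueError("non-negativity constraint violated")
    if abs(sum(abundances) - 1.0) > 1e-9:
        raise ValueError("additivity constraint violated")
    rng = rng or random.Random(0)
    n_bands = len(endmembers[0])
    pixel = []
    for k in range(n_bands):
        value = sum(a * s[k] for a, s in zip(abundances, endmembers))
        if noise_std > 0:
            value += rng.gauss(0.0, noise_std)
        pixel.append(value)
    return pixel


# Hypothetical 4-band reflectance spectra for three endmembers.
water = [0.05, 0.04, 0.02, 0.01]
trees = [0.04, 0.08, 0.45, 0.30]
vehicle = [0.20, 0.22, 0.25, 0.27]
x = mix_pixel([water, trees, vehicle], [0.2, 0.5, 0.3])
print(x)
```

Checking the constraints at synthesis time is a design choice for this sketch; an unmixing routine would instead have to enforce them during estimation.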

2.2.  MTI  Radar 

The objective of MTI radar systems is to detect the presence of
moving objects. MTI radars on airborne platforms illuminate a scene
with a waveform and sample the return at each element of a
multi-element array (we restrict our attention to uniform linear
arrays, ULA). The process is repeated during a coherent processing
interval (CPI). After pulse compression, the data is organized into a
three-dimensional CPI datacube, as depicted in Figure 2, that is
indexed by 1) pulse number, 2) element number, and 3) sample number
(range).

At each range value, a two-dimensional function locates the presence
of reflecting objects by their cone angle and their corresponding
Doppler frequency. For a fixed system, the signal strength returned
by a target depends upon its radar cross-section (RCS) value and its
range. Stationary objects will yield values along a "clutter ridge",
whereas moving objects will lie off the ridge by an amount
proportional to their velocity relative to the platform. A moving
target is most visible when its velocity is high (so as to move it as
far away as possible from the clutter ridge), and when it returns a
strong signal.


By  virtue  of  linearity,  MTI  radar  observes  a  signal  model 
similar  to  the  LMM  in  (2).  The  vector  signal  measured  by 
an  antenna  array  is  the  linear  superposition  of  reflections  re¬ 
ceived  from  all  directions,  and  when  a  target  is  present,  the 
corresponding  signal  is  given  by 

x  =  t  +  c  +  n.  (3) 

Here, x is an M × 1 observation vector, where M is the number of
elements on the ULA, c and n are clutter and noise, respectively, and
t is the target, expressed as $t = \alpha v(\phi, f)$. Here $\alpha$
is the relative amplitude of the return signal, and v is the steering
vector, which is related to the geometry of the ULA as well as signal
parameters. The entries of v are given by:

$$v_{mn} = e^{j 2\pi \left[ (n-1)\frac{f}{PRF} + (m-1)\frac{d}{\lambda}\cos\phi \right]} \qquad (4)$$

where $m = 1, \ldots, M$ is the element number, $n = 1, \ldots, N$ is
the pulse number, $\phi$ is the azimuth angle, $f$ is the Doppler
frequency, $\lambda$ is the wavelength, $d$ is the array element
spacing, and PRF is the pulse repetition frequency. Resolution in MTI
radar systems is driven in the range direction by the signal
bandwidth of the interrogating signal and by the aperture length in
azimuth.
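
The structure of the steering vector can be sketched numerically. The snippet below assumes the convention implied by the definitions above, namely that the element index multiplies the spatial phase term (d/λ)cos φ and the pulse index multiplies the Doppler term f/PRF; the function name and all parameter values are hypothetical.

```python
import cmath
import math


def steering_vector(M, N, phi, f, prf, d, wavelength):
    """Return space-time steering entries v[m][n] using 0-based indices.

    Assumed convention: element index m drives the spatial phase
    (d / wavelength) * cos(phi); pulse index n drives the Doppler phase f / prf.
    """
    spatial = (d / wavelength) * math.cos(phi)
    temporal = f / prf
    return [
        [cmath.exp(2j * math.pi * (m * spatial + n * temporal)) for n in range(N)]
        for m in range(M)
    ]


# Hypothetical setup: half-wavelength spacing and a broadside target.
v = steering_vector(M=4, N=8, phi=math.pi / 2, f=100.0, prf=1000.0,
                    d=0.15, wavelength=0.3)
print(v[0][:3])
```

Every entry has unit modulus, so the vector carries only phase information; the amplitude of the return is carried separately by the scalar that multiplies v.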

2.3. Relationships Between HSI and MTI Radar

We can see from (2) and (3) that signal models for HSI and MTI radar
are quite similar. Both sensors organize measurements that occupy
three axes (see Figures 1 and 2). Despite the fact that HSI is
passive and yields non-negative vector measurements, and MTI radar is
a form of active sensing producing complex values, the key to this
equivalence is the parallelism between endmembers and steering
vectors as well as RCS and fractional abundances.

In (4), v is a vector whose structure gives rise to the
complex-valued signal in x. When a target is in motion at a specific
range, its location in azimuth and Doppler frequency decide the exact
value of the steering vector. The one-dimensional subspace defined by
the target vector, t, varies depending on the location and speed of
the target. In most instances, the resolution cell size is
sufficiently small that only one moving target resides in it. In the
case, however, where multiple moving targets reside in a single cell,
the response from the cell will be the sum of weighted steering
vectors, each having its own Doppler frequency. The target response,
t, can be extended to include P targets, so that
$t = \sum_{i=1}^{P} \alpha_i v_i = V\alpha$, where $\alpha$ collects
the P relative amplitudes. Compared to (2), the steering vectors that
are columns of V are analogous to the endmembers in S. Further, in
(3), c and n are comparable to the background and additive noise in
(2), and their statistics are key factors in the detectors for each
sensor type.


3.  TARGET  DETECTION 

Based  on  the  comparable  signal  models  for  HSI  and  MTI 
radar  discussed  in  Section  2,  we  can  consider  strategies  for 
detection  in  each.  The  LMM  has  been  employed  in  numer¬ 
ous  circumstances  to  describe  the  mixing  process.  For  the 
purpose  of  target  detection,  it  is  capable  of  conveying  the 
mathematical  relationship  between  the  spectra  of  targets  and 
background.  By  virtue  of  the  LMM,  we  assume  that  all  pix¬ 
els  in  a  scene  imaged  by  a  hyperspectral  sensor  consist  of  at 
least  one  endmember  from  the  columns  of  S. 

A specific type of target possesses a spectrum, but variability can
arise due to many factors, including changes in observation
conditions. Depending on its source, variability can be accounted for
by adding endmembers (and corresponding abundances) to describe the
same target under different conditions, or by shaping the additive
noise, n, in the LMM to reflect statistical variability. $S_t$
denotes the subset of columns in S describing targets, and $S_b$
denotes background endmembers. Because the entries of S are
non-negative, $S_t$ and $S_b$ cannot be mutually orthogonal spaces,
and the subspaces they span necessarily overlap.
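
The non-orthogonality remark can be checked with a two-line computation: the inner product of two spectra whose entries are all strictly positive is itself strictly positive, so the spectra cannot be orthogonal. The spectra below are hypothetical four-band examples, and the argument assumes the two spectra are nonzero in at least one common band.

```python
def inner(u, v):
    """Euclidean inner product of two equal-length spectra."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical target and background reflectance spectra (all entries > 0).
target = [0.20, 0.22, 0.25, 0.27]
background = [0.04, 0.08, 0.45, 0.30]

overlap = inner(target, background)
print(overlap > 0)  # a sum of positive products cannot vanish
```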

3.1.  Types  of  Hyperspectral  Detection 

The task of detection can be posed for two separate circumstances
that are of interest in hyperspectral processing [1]. The Known
Target detection problem occurs when the presence of a specific
target is to be detected amid background and noise, and $S_t$ is
known. In contrast, the Unknown Target detection problem has no
knowledge of a target subspace, but attempts to detect any pixel that
is different from the background. For this reason, detectors designed
for this goal are often called anomaly detectors.

The class of Known Target detection algorithms can be further divided
into two categories. The set of structured background algorithms
assumes that the subspace where the background resides, $S_b$, is
known so that the LMM in (2) can be rewritten as

$$x = S_t a_t + S_b a_b + n \qquad (5)$$

$$\phantom{x} = \sum_{i=1}^{P_t} a_i s_i + \sum_{i=P_t+1}^{P_t+P_b} a_i s_i + n \qquad (6)$$

where $P = P_t + P_b$. Note that the obstacles to perfect detection,
background and additive noise, have been modelled as two distinct
entities, $S_b a_b$ and $n$. The resulting binary detection test for
structured background is

$$H_0 : x = S_b a_b + n \qquad (7)$$

$$H_1 : x = S_t a_t + S_b a_b + n \qquad (8)$$

Alternatively, if the background endmembers are unknown, the sources
of interference cannot be separated into separate background and
noise terms. The unstructured background problem lumps all non-target
pixel contributions into a single vector, w, and the resulting binary
detection test is written as:

$$H_0 : x = w \qquad (9)$$

$$H_1 : x = S_t a_t + w \qquad (10)$$

The different pairs of hypotheses in (7-8) and (9-10) convey varying
levels of knowledge about the detection problem and are critical to
the formulation of optimal detectors.

When the size of a target is expected to be equal to or greater than
that of a pixel, i.e., the target is resolved, the background is no
longer present in either hypothesis. This is a significant departure
from radar detection models, which assume an additive target appears
in addition to clutter. A replacement target displaces some amount,
or all, of the environmental interference, or background. The fact
that the amount of background displaced by a target in a mixed pixel
can vary means that the statistics of the interference will also
vary. As a consequence, the foremost challenge in the design of
optimal, statistical detectors for sub-pixel targets stems from the
uncertainty of what fraction of the pixel the target occupies.

3.2. MTI Detection

Like the techniques for hyperspectral detection, algorithms in MTI
radar find moving targets by exposing the Doppler effect in signals
measured by a ULA. Just as the subspaces defined by target and
background endmembers in HSI detection provide the basis for
separating target and background pixels, the geometry of the array,
along with the signal parameters, are combined by algorithms to
maximize the visibility of moving targets.

Algorithms for detecting t in (3) optimally suppress the presence of
c and n by means of Space-Time Adaptive Processing (STAP) [2].
Resembling the detection model for a known target in an unstructured
background, the binary detection model for a moving target, t, is
given by

$$H_0 : x = w \qquad (11)$$

$$H_1 : x = t + w, \qquad (12)$$

where w = c + n in (3).

Moving targets may be present at any range and azimuth position, and
each pixel in the MTI radar datacube is a candidate for a detection
test. For a specific range value, the cube of MTI data reduces to a
single plane having MN resolution cells. It is well known that the
detector which maximizes the SNR whitens the received signal based on
a filter derived from the covariance of the interference. A
covariance, $R_w$, having size MN × MN introduces significant
complications, and, most often, a local covariance of a smaller size
is generated from a local neighborhood around the cell being
processed. Confining the covariance to a neighborhood reduces the
possibility of introducing non-stationary behavior and results in a
more precise estimate.

3.3. Relationship Between HSI and MTI Detection

As noted earlier, reflectance values in hyperspectral processing are
non-negative and no greater than one, and unlike the intuition from
radar, targets do not necessarily induce signals of greater magnitude
than background. Rather, targets are discerned from background
reflectance spectra primarily by their shape, and detectors exploit
the differences in spectral shapes represented by the endmembers to
separate targets from background.

3.4. Detectors

We have shown that the signal models for hyperspectral target
detection in (7-8) and (9-10) and the signal model for the detection
of moving targets in (11-12) are similar. The key to the parallelism
lies in the similar roles played by endmembers and steering vectors
and the equivalence of abundances and RCS values, and is further
driven by the assumption of linearity when combining multiple
signals.

By (11-12), detection in MTI radar compares range-angle cells to a
threshold to determine whether or not a target is moving. Numerous
detectors have been proposed to perform this comparison, each
equipped to adaptively optimize some aspect of the decision. Notably,
the most desirable features of detectors are: 1) CFAR (Constant False
Alarm Rate), 2) maximum SNR, and 3) speed of computation. A detector
might be able to assure one of these features, at the expense of
maintaining the others, and the trade-off of these qualities is
instrumental to an appropriate implementation.

The same set of circumstances also surrounds hyperspectral detection.
A taxonomy of hyperspectral detectors for both the known and unknown
target cases appears in [1], indicating the hierarchy of common
detectors. The Generalized Likelihood Ratio Test (GLRT) [3] is a CFAR
detector that utilizes the unstructured background signal model, and
for a single known target spectrum, s, the GLRT for a test pixel
spectrum, x, is given by:


Other familiar detectors may be derived directly from the GLRT under
specific circumstances, such as the Adaptive Matched Filter (AMF) [3]
and the Adaptive Coherence Estimator (ACE) [4, 5]. In the improbable
case where the interference covariance is the identity matrix, the
ACE simplifies to a simple cosine measure between s and x, often
referred to in the hyperspectral processing literature as the
Spectral Angle Mapper (SAM). It is defined as:

$$T_{SAM}(x) = \frac{s^T x}{\sqrt{s^T s}\,\sqrt{x^T x}}$$

With no incorporation of background statistics, SAM clearly cannot be
CFAR or optimum in any sense.

Fig. 3. Forest Radiance I scene.
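
The cosine measure behind SAM is simple enough to sketch directly. The function below is an illustrative implementation of the normalized inner product between a target spectrum s and a pixel spectrum x; its name and the guard against zero spectra are choices made for this sketch.

```python
import math


def sam_statistic(s, x):
    """Cosine of the angle between target spectrum s and pixel spectrum x."""
    num = sum(a * b for a, b in zip(s, x))
    den = math.sqrt(sum(a * a for a in s)) * math.sqrt(sum(b * b for b in x))
    if den == 0.0:
        raise ValueError("a zero spectrum has no direction")
    return num / den


# A pixel that is a scaled copy of the target scores 1 regardless of brightness.
print(sam_statistic([1.0, 2.0, 3.0], [3.0, 6.0, 9.0]))
```

Because the statistic depends only on spectral shape, it is invariant to overall illumination scaling, but it uses no background statistics at all.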

4.  HYPERSPECTRAL  DETECTION  RESULTS 

Figure 3 displays the RGB image of the Forest Radiance I scene imaged
by the Hyperspectral Digital Imagery Collection Experiment (HYDICE)
sensor. The data collection acquired 210 bands of spectral data in
spectral bins 3-11 nm wide, ranging from 399 to 2501 nm (Visible to
Shortwave Infrared). The scene consists of 1280 lines of data, each
having 320 samples with approximately 1 m × 1 m spatial resolution.
Three regions of distinct background type have been demarcated:
trees, grass, and mixed. In addition, a separate region is outlined
encompassing several vehicles of the same type, from which pure
target pixels are derived. Figure 4(a) illustrates the mean target
spectrum obtained from 37 pure target pixels.

We  demonstrate  detection  with  hyperspectral  data  in  two 
different  experiments.  The  goal  of  the  first  experiment  is  to 
demonstrate  how  sub-pixel  targets  are  detected  when  they 
appear  mixed  with  background.  The  second  experiment 
considers  the  extreme  case  of  the  sub-pixel  target  problem 
when  the  target  is  resolved  and  obscures  all  background 
when  it  is  present.  For  both  experiments,  the  performance 
of  the  SAM  and  GLRT  detectors  is  compared  side-by-side. 


4.1.  Sub-pixel  Targets 

Sub-pixel target spectra have been created synthetically by adding
the pure mean target spectrum from Figure 4(a) in varying proportions
to the 8232 pure tree spectra (background) in Figure 3. Although
there is no assurance that spectra mix linearly in real mixed pixels,
we have employed this assumption for our investigation until accurate
sub-pixel target data and ground truth become available.

We  have  estimated  the  background  covariance  from  the 
homogeneous  tree  spectra.  Both  detectors  yield  values  be¬ 
tween  0  (background)  and  1  (target),  and  pure  background 
detection  statistic  values  have  been  generated  from  the  8232 
tree  pixels.  An  equal  number  of  target  mixtures  resulted  by 
combining  the  same  background  pixels  with  the  mean  target 
vehicle  spectrum  in  25%/75%,  50%/50%,  and  75%/25% 
target/background  proportions.  The  range  of  detection  statis¬ 
tic  values  for  the  SAM  detector  appears  in  Figure  4(b)  and 
for  the  GLRT  in  Figure  4(c). 
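
The construction of synthetic sub-pixel mixtures can be sketched as follows. This is a toy version under stated assumptions: the spectra are hypothetical four-band vectors rather than HYDICE data, the mixture is formed as a fraction of target plus the complementary fraction of background, and only the cosine (SAM-style) statistic is evaluated because it needs no covariance estimate.

```python
import math


def cosine(s, x):
    """Cosine of the angle between spectra s and x."""
    num = sum(a * b for a, b in zip(s, x))
    den = math.sqrt(sum(a * a for a in s)) * math.sqrt(sum(b * b for b in x))
    return num / den


def mix(target, background, fraction):
    """Assumed mixture: `fraction` of target, (1 - fraction) of background."""
    return [fraction * t + (1.0 - fraction) * b for t, b in zip(target, background)]


target = [0.20, 0.22, 0.25, 0.27]      # hypothetical mean target spectrum
backgrounds = [                         # hypothetical tree pixels
    [0.04, 0.08, 0.45, 0.30],
    [0.05, 0.09, 0.40, 0.28],
    [0.03, 0.07, 0.50, 0.33],
]

# Range (min, max) of the statistic for each target fraction; 0.0 is pure background.
ranges = {}
for fraction in (0.0, 0.25, 0.50, 0.75):
    stats = [cosine(target, mix(target, b, fraction)) for b in backgrounds]
    ranges[fraction] = (min(stats), max(stats))

for fraction, (lo, hi) in ranges.items():
    print(f"target fraction {fraction:.2f}: statistic in [{lo:.4f}, {hi:.4f}]")
```

Comparing the interval for pure background with the intervals for each mixture gives a small-scale analogue of the overlap and separation regions discussed for Figure 4.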

In  Figures  4(b)  and  4(c),  the  regions  of  dark  blue  cor¬ 
respond  to  the  range  of  statistic  values  induced  by  the  tree 
pixels.  The  regions  of  red  correspond  to  the  range  of  tar¬ 
get  mixture  statistic  values.  Intervals  of  light  blue,  if  any, 
correspond  to  regions  where  test  statistics  from  pure  back¬ 
ground  pixels  and  sub-pixel  targets  overlap  and  indicate  pix¬ 
els  where  false  alarms  and  missed  detections  could  occur.  A 
white  strip  appears  at  the  value  of  the  mean  target/background 
statistic.  Regions  of  yellow  indicate  the  amount  of  separa¬ 
tion,  if  any,  between  target  and  background.  The  greater  the 
width  of  the  yellow  region,  the  better  the  detector  is  capable 
of  separating  sub-pixel  targets  from  pure  background. 


4.2.  Resolved  Targets 

The 37 target pixels in Figure 3 are fully resolved, and they
completely obscure any background. Although there is no background to
whiten when the target is present, we used the 37 target pixels and
the 8232 background pixels to assess the performance of the SAM and
GLRT detectors in separating pure target and pure background spectra.
In Figure 4(d), both the SAM and GLRT detection results for resolved
targets are depicted side-by-side.

4.3. Discussion

In Figure 4(b), the SAM detector is unable to successfully separate
every sub-pixel target until 75% of the pixel is occupied by the
target. This is not surprising since SAM does nothing to suppress the
background. On the other hand, in Figure 4(c), the GLRT has a
relatively large separation between pure background and
target/background mixtures even when the target occupies only 25% of
the pixel. This is due to the suppression of the background through
whitening by the inverse covariance, $R^{-1}$, in (13). For resolved
targets, Figure 4(d) confirms that whitening significantly improves
the separability of target and background.

Fig. 4. (a) Mean target spectrum; (b) Sub-pixel detection statistics
for SAM; (c) Sub-pixel detection statistics for GLRT; (d) Resolved
target detection results for SAM and GLRT.

In both experiments, the same estimated covariance was
used regardless of the percentage of background present.
For this target and background, the results in Figure 4 show
that,  even  when  the  background  covariance  is  mismatched 
to  the  amount  of  background  present,  the  performance  still 
exceeds  that  of  the  SAM  detector.  Proper  cancellation  of 
background  for  hyperspectral  detection  is  a  function  of  the 
percentage  of  background  present  as  well  as  the  relationship 
between  the  target  and  background  subspaces.  Based  on  the 
LMM,  this  relationship  will  be  key  for 

5.  CONCLUSION 

We have demonstrated in this paper that, under the assumption of
linear mixing, detection in hyperspectral processing bears
significant similarities to detection in MTI radar. The key to this
parallelism is the analogous relationship between endmembers and
steering vectors as well as abundances and RCS values. Our detection
results indicate that statistical detectors for radar can be adapted
to hyperspectral signals for both the sub-pixel and resolved target
problems, even though sub-pixel targets give rise to replacement
target models. Future work will continue to investigate methods for
translating the optimalities of radar detection to the hyperspectral
domain.

ABSTRACT

The problem treated in this paper is the detection and restoration of
ship wakes in Synthetic Aperture Radar (SAR) images. This is not
easy, because SAR images are corrupted by a granular noise, called
speckle, and because there is no information about the direction and
the level of the wake. For these reasons, most detection algorithms
use the Radon transform, which maps a straight line to a point. We
propose here a new method, based on the marriage between the Radon
transform and a filtering method used to interpolate the image in a
rotating reference system, introduced by the Radon transform theory.
This filtering technique is the stochastic matched filtering
technique, which maximizes the signal to noise ratio after
processing. Experimental results on SIR-C/X-SAR images are presented
and compared to those obtained using the classical approach.

1.  INTRODUCTION 

Recently, a great deal of research has been dedicated to ship wake
detection in Synthetic Aperture Radar (SAR) images. Indeed, it is
well known that SAR images are able to show ship wakes as lines
darker or sometimes brighter than the surrounding sea. Most of the
detection algorithms use the Radon transform [2, 7, 8]. Indeed, when
an image contains a straight line or a segment, its Radon transform
exhibits a narrow peak if the line is brighter than its surroundings
and a trough in the opposite case. Thus, the problem of finding lines
reduces to detecting these peaks and troughs in the transform domain.
Other methods detect both ships and ship wakes [5]. Given that SAR
images are affected by a granular, multiplicative noise (called
speckle), most of these detection algorithms pre-filter the data in
order to improve the visibility of the ship wakes.

We know that the application of the Radon transform requires the
computation of an interpolated image in a rotating reference system.
In this paper, we propose a new method based on the marriage between
the Radon transform and a filtering method. We use this filtering
technique to compute the interpolations of the SAR image, in order to
properly estimate the signal of interest (the ship wake) in the
rotating reference system. This processing is called the stochastic
matched filtering technique [1]. It is based on the expansion of the
signal into a series of functions with uncorrelated random variables
for decomposition coefficients. This corresponds to the
Karhunen-Loeve expansion in the case of a white noise. Because the
chosen basis functions improve the signal to noise ratio after
processing, there are no more sinusoidal curves corresponding to the
speckle in the Radon domain, and the detection of the peak (or
trough) corresponding to the ship wake is improved.

First of all, we recall in section 2 the discrete Radon transform
[4], in the case of a two-dimensional signal. Then, we present, in
section 3, an interpolation-filtering method for a noise-corrupted
image in a rotating reference system. First, we recall the stochastic
matched filtering technique and then we describe how to perform an
interpolation based on this method and using the discrete cosine
transform. We finish this section with the explanation of the
subimage processing. Next, in section 4, we propose an example of
application of our processing on a SIR-C/X-SAR image, which shows a
moving ship and its dark turbulent wake. We finish this article with
a comparison of our results with those obtained by the classical
approach, which uses the Radon transform based on the nearest
neighbor interpolation.

2.  THE  DISCRETE  RADON  TRANSFORM 

The  Radon  transform  on  Euclidean  space  was  first  established 
by  Johann  Radon  in  1917.  Nearly  half  a  century  after  Radon’s 
work,  the  Hough  transform  for  detecting  straight  lines  in  digital 
pictures  was  introduced.  But  this  transform  is  actually  a  special 
case  of  the  Radon  transform.  We  are  going  to  recall  in  this  section 
the  discrete  Radon  transform  of  a  two-dimensional  signal. 
Considering an image I of $(M+1) \times (M+1)$ pixels, its discrete
Radon transform, $\tilde{I}$, is expressed by:

$$\tilde{I}(x_\theta, \theta) = \sum_{y_\theta=-M/2}^{M/2} I(x_\theta \cos\theta - y_\theta \sin\theta,\; x_\theta \sin\theta + y_\theta \cos\theta),$$

where $x_\theta$ and $y_\theta$ are integers bounded by $-M/2$ and
$M/2$, and $\theta$ is the rotation angle, which takes values between
0 and $\pi$.

The previous equation shows that the computation of the Radon
transform requires, for each value of $\theta$, the calculation of
the new pixel values in the rotated reference system
$(x_\theta, y_\theta)$. Indeed, the new coordinates are not integer
values and so do not correspond to the native mesh of the image.

By nature, the Radon transform accentuates linear features in an
image by integrating along all possible lines. The result is that an
image which is nonzero at a single point $(x_0, y_0)$ has a Radon
transform which is nonzero along a sinusoidal curve of equation
$x_\theta = x_0 \cos\theta + y_0 \sin\theta$. The amplitude and phase
of this sinusoidal curve depend on the spatial location of the
corresponding point in the original image. If the original image
contains a straight line or a segment, its Radon transform exhibits a
narrow peak if the line is brighter than its surroundings, and a
trough in the opposite case. The coordinates of the peak (or trough)
are $(x_{\theta_0}, \theta_0)$, which correspond to the parameters of
the polar equation $x_{\theta_0} = x \cos\theta_0 + y \sin\theta_0$
of the straight line. Thus, the problem of finding lines is reduced
to the detection of peaks and troughs in the transform domain. The
Radon transform is particularly suited for finding lines in a
noise-corrupted image, because the integration process tends to
cancel out intensity fluctuations due to the noise. For this reason,
we can find in the literature several applications in the domain of
wake detection (see [2, 7, 8] for example).
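
The discrete Radon transform with nearest-neighbour interpolation (the classical approach mentioned in the introduction) can be sketched in a few lines. This is an illustrative implementation, not the authors' code: the function name, the row/column indexing convention, and the choice to treat samples that fall outside the image as zero are all assumptions made for this sketch.

```python
import math


def radon_nearest(image, n_angles):
    """Discrete Radon transform of a square (M+1) x (M+1) image, M even.

    image[row][col] with col = x + M/2 and row = y + M/2, so x and y run
    over -M/2 .. M/2. Returns sinogram[k][i] for theta_k = k * pi / n_angles
    and x_theta = i - M/2.
    """
    size = len(image)
    half = (size - 1) // 2
    sinogram = []
    for k in range(n_angles):
        theta = k * math.pi / n_angles
        c, s = math.cos(theta), math.sin(theta)
        projection = []
        for x_t in range(-half, half + 1):
            total = 0.0
            for y_t in range(-half, half + 1):
                # Rotated coordinates are generally non-integer ...
                x = x_t * c - y_t * s
                y = x_t * s + y_t * c
                # ... so snap them to the nearest pixel of the native mesh.
                xi = math.floor(x + 0.5)
                yi = math.floor(y + 0.5)
                if -half <= xi <= half and -half <= yi <= half:
                    total += image[yi + half][xi + half]
            projection.append(total)
        sinogram.append(projection)
    return sinogram


# A 9 x 9 test image (M = 8) containing a bright vertical line at x = 2.
M = 8
line_image = [[1.0 if col == 2 + M // 2 else 0.0 for col in range(M + 1)]
              for _ in range(M + 1)]
sinogram = radon_nearest(line_image, n_angles=4)
peak = max(max(row) for row in sinogram)
print(peak, sinogram[0][2 + M // 2])
```

For this test image the line collapses into a single peak in the projection at θ = 0 and x_θ = 2, while a single bright pixel would instead trace the sinusoid x_θ = x_0 cos θ + y_0 sin θ across the angles.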

3.  INTERPOLATION-FILTERING  OF  A  ROTATING 
IMAGE 

We have seen that the Radon transform of an image implies, for each
value of the $\theta$ parameter, the computation of an interpolated
image. We want to use the Radon transform in order to detect ship
wakes in SAR images. Given that SAR images are corrupted by a
granular noise, called speckle, it is of great interest to take this
noise into account when computing the interpolation of such an image,
in order to give a good estimation of the signal of interest in the
rotating reference system. For this reason, we present here an
interpolation method, based on the stochastic matched filtering
method, whose principle is to expand the noise-corrupted signal into
a series of functions with uncorrelated random variables for
decomposition coefficients.

3.1.  The  stochastic  matched  filtering  method 

Consider a two-dimensional noise-corrupted signal, Z(x, y), defined
over $D = [-T; T] \times [-T; T]$. It corresponds to the
superposition of a signal of interest S(x, y) with a noise B(x, y):

$$Z(x,y) = S(x,y) + B(x,y) \quad \forall (x,y) \in D,$$

where S(x, y) and B(x, y) are assumed to be independent and
stationary.

We want to expand simultaneously the signal of interest and the noise
into series of the form:

$$S(x,y) = \lim_{N \to \infty} \sum_{n=1}^{N} s_n \Psi_n(x,y)$$

$$B(x,y) = \lim_{N \to \infty} \sum_{n=1}^{N} b_n \Psi_n(x,y)$$

In these expressions, $\Psi_n(x,y)$ are the deterministic linearly
independent basis functions, and $s_n$ and $b_n$ represent zero-mean
random variables expressed by the following relations:

$$s_n = \iint_D S(x,y)\,\Phi_n(x,y)\,dx\,dy, \qquad b_n = \iint_D B(x,y)\,\Phi_n(x,y)\,dx\,dy. \qquad (1)$$

The determination of these random variables depends on the choice of
the set of deterministic functions $\{\Phi_n(x,y)\}$. We will use the
set which makes the random variables uncorrelated, i.e.:

$$E\{s_n s_m\} = E\{s_n^2\}\,\delta_{n,m}, \qquad E\{b_n b_m\} = E\{b_n^2\}\,\delta_{n,m}.$$

Now, we show how to determine these functions $\Phi_n(x,y)$. In order
to find them, let us consider the stochastic matched filtering
technique [1].

If we consider a deterministic, stationary two-dimensional signal,
called S(x, y), which is defined over D, corrupted by an ergodic,
stationary noise B(x, y), the matched filtering technique consists in
finding a function $\Phi(x,y)$, defined over D, that maximizes the
signal to noise ratio K, expressed by the following relation:

$$K = \frac{\left| \iint_D S(x,y)\,\Phi(x,y)\,dx\,dy \right|^2}{E\left\{ \left| \iint_D B(x,y)\,\Phi(x,y)\,dx\,dy \right|^2 \right\}}$$

When the signal is not deterministic, but a random, zero-mean,
stationary, two-dimensional signal S(x, y), we can show that K can be
expressed as follows:

$$K = \frac{E\left\{ \left| \iint_D S(x,y)\,\Phi(x,y)\,dx\,dy \right|^2 \right\}}{E\left\{ \left| \iint_D B(x,y)\,\Phi(x,y)\,dx\,dy \right|^2 \right\}}$$

Given that this signal to noise ratio can be rewritten as the ratio
of two quadratic forms, it is a Rayleigh quotient, so it is maximized
if $\Phi(x,y)$ is the two-dimensional eigenfunction associated with
the maximal eigenvalue of the following integral equation:

$$\iint_D \Gamma_{SS}(x-x', y-y')\,\Phi_n(x',y')\,dx'\,dy' = \lambda_n \iint_D \Gamma_{BB}(x-x', y-y')\,\Phi_n(x',y')\,dx'\,dy', \qquad (2)$$

for all $(x,y) \in D$, where $\Gamma_{SS}$ and $\Gamma_{BB}$
represent the normalized covariances of the signal and of the noise,
respectively.

Random variables $s_n$ and $b_n$ are uncorrelated when the
$\Phi_n(x,y)$ functions are the eigenfunctions of integral equation
(2), with eigenvalues $\lambda_n$ verifying:

$$\lambda_n = \frac{\sigma_B^2\,E\{s_n^2\}}{\sigma_S^2\,E\{b_n^2\}}$$

When eigenfunctions $\Phi_n(x,y)$ are normalized such that the
following integral
I  Id  I  Id  ^bb^X  ~  y)$ri(x7y)$n(x7y)dxdxdydy 

takes  one  for  value,  we  can  show  that  functions  ^(x,  y)  are  ex¬ 
pressed  by: 

^n(x,y)  =  JJ^TBb(x  -x\y-  y)$n{x\y)dxdy.  (3) 
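As an illustration only, the discrete counterpart of this eigenvalue problem (signal covariance on one side, noise covariance on the other, noise-normalized eigenvectors) can be sketched in Python with NumPy. The covariance models, the grid size and the function name are assumptions made for the example; the paper itself works with the continuous equation, so this is not the authors' implementation.

```python
import numpy as np

def stochastic_matched_filter_basis(gamma_ss, gamma_bb):
    """Solve gamma_ss @ phi = lam * gamma_bb @ phi for symmetric matrices.

    Returns the eigenvalues in decreasing order, the vectors phi_n scaled so
    that phi_n^T gamma_bb phi_n = 1, and psi_n = gamma_bb @ phi_n.
    """
    # Whitening with the Cholesky factor of the noise covariance turns the
    # generalized problem into an ordinary symmetric eigenproblem.
    chol = np.linalg.cholesky(gamma_bb)           # gamma_bb = chol @ chol.T
    inv_chol = np.linalg.inv(chol)
    whitened = inv_chol @ gamma_ss @ inv_chol.T
    lam, vecs = np.linalg.eigh(whitened)          # ascending eigenvalues
    order = np.argsort(lam)[::-1]
    lam, vecs = lam[order], vecs[:, order]
    phi = inv_chol.T @ vecs                       # phi^T gamma_bb phi = I
    psi = gamma_bb @ phi
    return lam, phi, psi

# Hypothetical 1-D covariance models (assumed, not taken from the paper).
idx = np.arange(8)
lag = np.abs(idx[:, None] - idx[None, :])
gamma_ss = np.exp(-0.1 * lag)                     # slowly decaying signal correlation
gamma_bb = 0.5 ** lag + 0.1 * np.eye(8)           # short noise correlation length
lam, phi, psi = stochastic_matched_filter_basis(gamma_ss, gamma_bb)
```

With this normalization the projections of signal and noise on the vectors `phi` are uncorrelated, and each eigenvalue is the ratio of the projected signal power to the projected noise power.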

In these conditions and considering the $z_n$ random variables,
obtained by projecting functions $\Phi_n(x,y)$ on noise-corrupted
signal $Z(x,y)$, we can show that the use of the following expansion

$$Z(x,y) = \lim_{N\to\infty}\sum_{n=1}^{N} z_n\,\Psi_n(x,y)$$

corresponds to a signal to noise ratio of the $n$th component of
$Z(x,y)$ expressed by:

$$\frac{\sigma_S^2}{\sigma_B^2}\,\lambda_n,$$

where $\sigma_S^2/\sigma_B^2$ is the signal to noise ratio before processing.

So all the eigenfunctions $\Phi_n(x,y)$ associated with eigenvalues
$\lambda_n$ greater than one can contribute to an improvement of the
signal to noise ratio. For this reason, the observed signal can be
filtered by keeping all the components with a signal to noise ratio
greater than a certain level, in any case greater than 1.
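The selection rule can be sketched as follows. This is a minimal illustration that assumes, in line with the statement that eigenvalues greater than one improve the ratio, that the signal to noise ratio of the n-th component is the eigenvalue times the ratio before processing; the function names are hypothetical.

```python
def select_components(eigenvalues, threshold=1.0):
    """Return Q, the number of components whose eigenvalue exceeds the threshold."""
    if threshold < 1.0:
        raise ValueError("the threshold must be at least 1")
    return sum(1 for lam in eigenvalues if lam > threshold)

def component_snr(eigenvalues, snr_before):
    """Per-component signal to noise ratio, assuming SNR_n = lambda_n * SNR_before."""
    return [snr_before * lam for lam in eigenvalues]

q = select_components([3.2, 1.5, 1.0, 0.4])       # two eigenvalues exceed 1
snrs = component_snr([2.0, 0.5], snr_before=4.0)
```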


3.2. Interpolation using the stochastic matched filtering method

To compute an interpolation based on the stochastic matched
filtering method, the basic idea is to expand the observed signal
and then to restore the signal of interest using the $\Psi_n(x,y)$
functions previously interpolated. But this approach has some
drawbacks. Indeed, it implies a heavy CPU load and some memory
problems may appear. For these reasons, we propose in this section a
new formulation of the stochastic matched filtering method using the
discrete cosine transform.

3.2.1. Analytical approximation for the solutions of the integral equation

We can find in the literature several works based on the stochastic
matched filtering technique in its discrete form (see [6] for
example). Unfortunately, when we consider the discrete form of
integral equation (2), the eigenvectors that solve this generalized
eigenvalue problem are linked to the native sampling increment of the
image, which could cause problems for image interpolation. So, we do
not consider the discrete relation but the continuous one to find the
$\Phi_n(x,y)$ functions.

Considering the DCT¹ coefficients of functions $\Phi_n(x,y)$,
$\Gamma_{SS}(x-x',y-y')$ and $\Gamma_{BB}(x-x',y-y')$, we can show
that solving integral equation (2) becomes equivalent to solving the
following linear system:

$$\sum^{N_f}\sum^{N_f} \cdots \;=\; \lambda_n \sum^{N_f}\sum^{N_f} \cdots$$

where $(N_f + 1)$ corresponds to the number of DCT coefficients taken
into account and is high enough to ensure the uniform convergence of
the series to their respective functions.

Finally, the analytical approximation $\hat{\Phi}_n(x,y)$ of the
$\Phi_n(x,y)$ functions, solutions of the integral equation, is
obtained with the following relation, for $(x,y) \in D$:

$$\hat{\Phi}_n(x,y) = \sum_{k=0}^{N_f}\sum_{l=0}^{N_f} \alpha^n_{k,l}
\cos\left(\frac{\pi k (x - T)}{2T}\right)
\cos\left(\frac{\pi l (y - T)}{2T}\right).$$

This new method for finding an analytical approximation for the
solutions of the integral equation has been quantified and compared
to the classical approach in the case of the Fredholm integral
equation, that is, when the noise covariance describes a white
noise [3].

¹ Discrete Cosine Transform

3.2.2. Interpolation-filtering method

We propose in this subsection a new formulation of the stochastic
matched filtering method using the discrete cosine transform. In
these conditions, we are looking for the expression of the
coefficients of the filtered signal expanded into cosine series; we
shall reconstruct the approximation of the restored signal in the
final phase of the processing.

Let $Z(x,y)$ be the observed signal to be expanded and let
$\tilde{Z}(x,y)$ be the reconstructed filtered signal. We have:

$$\tilde{Z}(x,y) = \sum_{n=1}^{Q} z_n\,\Psi_n(x,y) \quad \forall (x,y) \in D, \tag{5}$$

where $Q$ is chosen such that $\lambda_Q$ is greater than a certain
threshold, in any case greater than 1.

In the last relation, $z_n$ are the random variables to be determined
from the input data:

$$z_n = \iint_D Z(x,y)\,\Phi_n(x,y)\,dx\,dy. \tag{6}$$

We have seen that these random variables are uncorrelated when
functions $\Phi_n(x,y)$ are the eigenfunctions of integral equation
(2). The $\Psi_n(x,y)$ basis functions are obtained by projecting
functions $\Phi_n(x,y)$ on the noise covariance as described in
relation (3).

First, we modify this relation in order to express the
$\beta^n_{p,q}$ coefficients of the $\Psi_n(x,y)$ cosine series. This
gives:

$$\beta^n_{p,q} = \sum_{k=0}^{N_f}\sum_{l=0}^{N_f} \cdots \tag{7}$$

for $(p,q) = 0, 1, \ldots, N_f$.

In like manner, from expansion (5), we obtain the expression of the
$\tilde{Z}_{k,l}$ DCT coefficients of restored signal $\tilde{Z}(x,y)$:

$$\tilde{Z}_{k,l} = \sum_{n=1}^{Q} z_n\,\beta^n_{k,l}. \tag{8}$$

Finally, we have to express the $z_n$ coefficients in terms of the
coefficients of the observed signal and of the eigenfunctions. We can
show that relation (6) is equal to:

$$z_n = \sum_{p=0}^{N_f}\sum_{q=0}^{N_f} \cdots \tag{9}$$

where $Z_{p,q}$ represent the $Z(x,y)$ DCT coefficients.

Considering the rotation angle $\theta$, it is possible to compute
the interpolated-filtered signal using the following relation:

$$\tilde{Z}_\theta(x_\theta, y_\theta) = \sum_{k=0}^{N_f}\sum_{l=0}^{N_f} \tilde{Z}_{k,l}
\cos\left(\frac{\pi k (x_\theta - T)}{2T}\right)
\cos\left(\frac{\pi l (y_\theta - T)}{2T}\right),$$

where $(x_\theta, y_\theta)$ represents the new coordinates of the
pixels in rotating reference system $\mathcal{R}_\theta$.
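The idea of evaluating a cosine series at the pixel positions of a rotated frame can be sketched as below. This is only an illustration under stated assumptions: the coefficient table `coeffs`, the half-width `T` of the domain, the cosine argument of the form cos(pi k (x - T) / (2T)) and the way rotated coordinates are mapped back to the native frame are all choices made for the example, not details confirmed by the source.

```python
import math

def cosine_series(coeffs, x, y, half_width):
    """Evaluate sum_k sum_l c[k][l] cos(pi k (x-T)/(2T)) cos(pi l (y-T)/(2T))."""
    t = half_width
    total = 0.0
    for k, row in enumerate(coeffs):
        cos_x = math.cos(math.pi * k * (x - t) / (2.0 * t))
        for l, c in enumerate(row):
            total += c * cos_x * math.cos(math.pi * l * (y - t) / (2.0 * t))
    return total

def rotated_samples(coeffs, half_width, theta_deg, grid):
    """Sample the series on a square grid expressed in the rotated frame."""
    th = math.radians(theta_deg)
    out = []
    for y_r in grid:
        row = []
        for x_r in grid:
            # Map rotated-frame coordinates back to the native frame (assumed convention).
            x = x_r * math.cos(th) - y_r * math.sin(th)
            y = x_r * math.sin(th) + y_r * math.cos(th)
            row.append(cosine_series(coeffs, x, y, half_width))
        out.append(row)
    return out

# A constant image (only the k = l = 0 coefficient) is unchanged by rotation.
flat = rotated_samples([[2.0]], 1.0, 35.0, [-0.5, 0.0, 0.5])
# A single k = 1 term sampled without rotation along x = -1, 0, 1.
ramp = rotated_samples([[0.0], [1.0]], 1.0, 0.0, [-1.0, 0.0, 1.0])
```

Because the series is an analytic function of the coordinates, no neighbor lookup is needed: the rotated image is obtained by evaluating the same coefficients at new positions.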

With such a formulation, it is possible to use the stochastic matched
filtering method for image interpolation in a very short time,
because all the computations can be made using only the fast discrete
cosine transform algorithm.

3.3. Subimages processing

To apply the stochastic matched filtering method, it is necessary to
respect the stationarity condition for the signal and the noise. We
know that we cannot consider an image as a realization of a
stationary process. But after segmentation, we can define several
areas, each representative of a texture. So a particular area of the
image (a subimage) can be considered as a stationary process. For
this reason, we cover the noise-corrupted image with subimages. The
subimage size is chosen such that each subimage can be assumed to be
a texture. For each angle $\theta$, we apply the proposed processing
to subimages of $M \times M$ pixels. Scanning the whole image allows
a complete processing.

Figure 1: Subimages processing showing the overlapping between
adjacent subimages

When we compute the interpolation of a subimage, we find an edge
effect. Indeed, in rotating reference system $\mathcal{R}_\theta$,
the coordinates of the pixels located on the edge of the subimage do
not depend on the native subimage in reference system $\mathcal{R}$.
This edge effect being maximal for angle $\theta$ equal to [...], the
subimage size after processing is $N \times N$, with $N$ equal to
[...]. To limit these edge effects, it is necessary to segment the
native image to obtain subimages which overlap, as shown in figure 1.
The gray areas correspond to the superposition of subimages;
coefficients $d$ and $h$ represent the distance between the centers
of two adjacent subimages (in line or row) and the width of the
overlapping area, respectively. We have:

$$d = \frac{N}{\cos\theta + \sin\theta} \quad\text{and}\quad
h = \frac{N \sin\theta}{\sin\theta + \cos\theta}.$$

We now apply the interpolation-filtering method, proposed in the
previous section, to each zero-mean subimage. Assuming that the noise
is high-frequency compared to the signal, when we apply this
processing to the whole noise-corrupted image with the same number
$Q$ of basis functions for each subimage, the resulting image may be
smoothed or still noise-corrupted. Indeed, the signal to noise ratio
is not the same for each area of the image. For this reason, we
process each subimage with a different number $Q$ of basis functions.
To find this number, let us consider mean square error $\epsilon$
between reconstructed signal $\tilde{Z}(x,y)$ and signal of interest
$S(x,y)$. We can show:

$$\epsilon = \cdots \sum_{n=1}^{Q}\sum_{k=0}^{N_f}\sum_{l=0}^{N_f} \cdots$$

So, for each subimage, we compute $\epsilon$ for different values of
parameter $Q$ ($Q$ being in the interval $[1; Q_{max}]$, with
$Q_{max}$ such that $\lambda_{Q_{max}}$ is greater than 1). We only
keep the $Q$ value which minimizes the mean square error. The noise
power, $\sigma_B^2$, is computed in a homogeneous area of the whole
noise-corrupted image and the signal power, $\sigma_S^2$, is
estimated in the subimage to be processed.
Considering now the problem posed by the overlapping areas, the
corresponding processed pixels are computed by averaging the pixels
having the same position in the overlapping subimages.
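The two bookkeeping steps described here, choosing for each subimage the number of basis functions that minimizes the error and averaging pixels covered by several subimages, can be sketched as follows. The data layout (tiles given as a top-left corner plus a block of rows, errors given as a dictionary) is an assumption made for the example.

```python
def best_q(errors):
    """Return the Q whose mean square error is smallest.

    errors maps each candidate Q to the error computed for the subimage.
    """
    return min(errors, key=errors.get)

def average_overlapping_tiles(height, width, tiles):
    """Merge processed subimages, averaging pixels covered more than once.

    tiles is an iterable of (top, left, block), block being a list of rows.
    Pixels covered by no tile are left at 0.0.
    """
    acc = [[0.0] * width for _ in range(height)]
    cnt = [[0] * width for _ in range(height)]
    for top, left, block in tiles:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                acc[top + i][left + j] += value
                cnt[top + i][left + j] += 1
    return [[acc[i][j] / cnt[i][j] if cnt[i][j] else 0.0
             for j in range(width)] for i in range(height)]

chosen = best_q({1: 0.5, 2: 0.2, 3: 0.3})
merged = average_overlapping_tiles(
    2, 3,
    [(0, 0, [[2.0, 2.0], [2.0, 2.0]]),
     (0, 1, [[4.0, 4.0], [4.0, 4.0]])])
```

In the merged result the middle column, covered by both tiles, takes the mean of the two processed values.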


4.  SHIP  WAKES  DETECTION 

To illustrate our processing, we have chosen to apply it to an image
acquired by the Spaceborne Imaging Radar-C/X-band Synthetic Aperture
Radar (SIR-C/X-SAR), which shows a moving ship and its dark turbulent
wake. This image is presented in figure 2.


Figure  2:  SIR-C/X-SAR  image  (698  x  698  pixels) 


The image size is 698 x 698 pixels. The number of gray levels is 256
(0: black, 255: white). The dark patches in the upper right of this
image correspond to smooth areas of low wind. The ship's wake is
about 28 kilometers (17 miles) long in this image and investigators
believe that it may reveal that the ship is discharging oil.
Classically, to quantify the perturbation level of a SAR image, we
determine its speckle level. The latter is obtained by computing the
variation coefficient ($C$ in the following) on several homogeneous
areas of the image. Let $W$ be the number of homogeneous areas $I_n$;
we have:


$$C = \frac{1}{W}\sum_{n=1}^{W} \frac{\sqrt{\mathrm{Var}\{I_n\}}}{E\{I_n\}}.$$


For the studied image, the variation coefficient is equal to 0.277.
We now have enough information about the studied image to process it.
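A variation coefficient of this kind can be computed as sketched below. The sketch assumes the usual definition, standard deviation divided by mean, evaluated on each homogeneous area and then averaged over the areas; the function name and the list-of-lists input are illustrative choices.

```python
import statistics

def variation_coefficient(areas):
    """Average over the areas of (standard deviation / mean) of the pixel values.

    areas is a list of homogeneous areas, each given as a flat list of
    strictly positive gray levels.
    """
    ratios = [statistics.pstdev(area) / statistics.mean(area) for area in areas]
    return sum(ratios) / len(ratios)

flat_area = variation_coefficient([[10, 10, 10]])        # no speckle at all
two_areas = variation_coefficient([[1, 3], [10, 10]])    # mean of 0.5 and 0.0
```

A perfectly uniform area gives 0, and the value grows with the relative fluctuation of the gray levels, which is why it serves as a speckle measure.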


4.1.  Signal  and  noise  auto-correlation  functions 

We have seen that the interpolation-filtering method, presented in
section 3, requires the a priori knowledge of the signal and the
noise auto-correlation functions. In order not to favor any
particular wake orientation, we have chosen to represent the signal
auto-correlation function by an isotropic model. This model results
from averaging, for different $\theta$ values, several
auto-correlation functions computed for a two-color straight line
placed in a rotating reference system. This model is presented in
figure 3.a.


Figure 3: Normalized model for the signal and noise auto-correlation
functions (T = 1)

The  noise  auto-correlation  function  is  presented  in  figure  3.b.  This 
model  has  been  obtained  by  averaging  several  realizations  of  noise 
auto-correlation  functions  computed  in  some  homogeneous  areas 
of  the  native  image. 

4.2. Radon domain and wakes restoration

After zero-meaning the image presented in figure 2, we have processed
it with a subimage size equal to 17 x 17 pixels, to respect the
coherence length of the noise.

We present in figure 4 the interpolated-filtered image for angle
$\theta$ equal to 35°. For this image, the number $Q$ of basis
functions lies between 1 and 13, depending on the native signal to
noise ratio of the subimage to be processed (near 13 basis functions
for the restoration of the wake and 1 basis function for the rest of
the image).


Figure 4: SIR-C/X-SAR image in reference system $\mathcal{R}_{35}$
(973 x 973 pixels)

Analyzing this figure, we see that the proposed processing allows a
great reduction of the speckle and a good restitution of the signal
of interest (the wake). Indeed, there are no more dark patches and
the variation coefficient is now equal to 0.016, so there is an
improvement by a factor of 18 of the speckle level.

We present in figure 5 the resulting image in the Radon domain.


Figure  5:  Radon  domain  obtained  using  the  interpolation-filtering 
method 

In the transform domain, the vertical axis represents the orientation
of each integration line, while the horizontal axis represents the
distance of each line from the center of the image. The trough
corresponding to the wake is clearly evident. Its vertical position
is near 51° and corresponds to the orientation of the wake.
Furthermore, several sinusoidal curves of low amplitude relative to
the trough amplitude are visible in the Radon domain and correspond
to the Radon transform of the residual perturbations after
interpolation-filtering.

From this transform domain, we have used the inverse Radon transform
to find the location of the wake in the spatial domain. This inverse
transform has been applied to the image presented in figure 5, after
it had first been raised to the power of three in order to increase
the amplitude of the trough relative to the rest of the image. The
different interpolations have been made using a nearest neighbor
interpolation. We present in figure 6 the resulting image.
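For readers unfamiliar with the transform, a Radon transform computed with nearest neighbor sampling can be sketched in a few lines. This is a generic illustration, not the processing chain of the paper: the angle convention (the angle gives the direction of the offset axis, the integration line being perpendicular to it), the square image and the function name are assumptions.

```python
import math

def radon_nearest(image, angles_deg):
    """Sum a square image along parallel lines, using nearest neighbor sampling.

    Returns one row per angle; entry i of a row is the sum along the line
    whose signed distance to the image center is i - (n - 1) / 2.
    """
    n = len(image)
    c = (n - 1) / 2.0
    sinogram = []
    for angle in angles_deg:
        th = math.radians(angle)
        cs, sn = math.cos(th), math.sin(th)
        row = []
        for i in range(n):
            rho = i - c
            total = 0.0
            for t in range(n):
                s = t - c
                # Point at offset rho along the normal and s along the line.
                x = int(round(rho * cs - s * sn + c))
                y = int(round(rho * sn + s * cs + c))
                if 0 <= x < n and 0 <= y < n:
                    total += image[y][x]
            row.append(total)
        sinogram.append(row)
    return sinogram

# Bright 5 x 5 image crossed by one dark horizontal line (a toy "wake").
toy = [[1] * 5 for _ in range(5)]
toy[2] = [0] * 5
sino = radon_nearest(toy, [0.0, 90.0])
```

In the toy example the dark line produces a single deep trough in the row of the sinogram whose integration lines are parallel to it, while the perpendicular direction only sees a uniform small decrease.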


Figure  6:  Restored  wake  from  figure  5  (696  x  696  pixels) 

The resulting image shows that our processing allows a great
improvement of the wake readability. All the disruptive pixels have
disappeared. The variation coefficient for this image is equal to
0.002 compared to 0.277 for the original image.


5. COMPARISON WITH CLASSICAL PROCESSING

We present in figure 7 the transform domain of the SIR-C/X-SAR image.
Each image in the rotating reference system has been computed using a
nearest neighbor interpolation.


Figure 7: Radon domain obtained using the classical approach

We remark a lot of perturbations, mainly due to the presence of the
dark patches. The variation coefficient for this image is equal to
0.145 and the improvement is only a factor of 2.


6. CONCLUSIONS


We have presented in this paper a new processing which allows ship
wake detection in SAR images. This processing is based on the
computation of the Radon transform of the SAR image. The original
contribution of this work, compared to the classical approaches in
this domain, consists in taking the noise into account for the image
interpolation in the rotating reference system. This gives the
perturbations a lower impact in the transform domain, the
corresponding sinusoidal curves having an amplitude smaller than the
peak or trough characteristic of the wake. We have compared the
transform domain and the restored wake obtained by our processing on
SAR images with those obtained with the classical processing. In all
cases, our processing gives far better results. With our processing,
the probability of false alarm or missed detection is lower than with
the classical approach, because only the signal of interest is
considered.

An important drawback of the Radon transform is that it is global by
nature, so this transform cannot tell the difference between long and
short straight lines. For this reason, future work in this domain
concerns the application of the proposed interpolation method to the
computation of the localized Radon transform [2], which makes it
possible to localize the beginning of the wake.

Abstract

Space-time adaptive processing (STAP) is two-dimensional adaptive
filtering employed for the purpose of clutter cancellation to enable
the detection of moving targets. It has been a major focus of
research activity in radar applications for which the platform is in
motion, e.g., airborne or space-based systems. In this setting, an
antenna sensor array provides spatial discrimination, while a series
of time returns or pulses forms a synthetic array that provides
Doppler (velocity) discrimination.

The application of STAP to the mobile towed-array sonar system is
non-trivial because of the complex multi-paths in the underwater
environment. On the other hand, matched-field processing (MFP), which
uses a propagation code to predict the complex multi-path structure
and coherently combines it to provide range/depth discrimination, has
been studied and demonstrated. MFP with a synthetic array (a series
of snapshots) to estimate the source velocity and localize the source
in range and depth has also been demonstrated (1).

STAMP combines the adjacent-filter beamspace post-Doppler STAP (2)
and MFP to provide improved performance for the mobile
multi-line-towed-array sonar applications. The processing scheme
includes: transforming phone time snapshots into the frequency
domain, forming horizontal beams at each frequency bin in the
directions of interest for each towed line, then combining signals
from multiple towed lines and from the adjacent Doppler bins and
beams that cover the multi-path Doppler spread due to motion, using
adaptive MFP. A study of STAMP performance in the towed-array
forward-looking problem will be discussed. In this problem, the
own-ship signal and its bottom-scattered energy can be treated as
stationary interference, with a target moving at constant speed
within a processing interval of a few minutes.

1.  Introduction 

Element-space pre-Doppler STAP (2) is two-dimensional fully adaptive
processing that coherently combines the signals from the elements of
an array and the multiple snapshots of coherent signals, to obtain
large spatial and temporal signal gain, to suppress interference, and
to provide target detection in azimuth and velocity. Computational
complexity and the need to estimate the interference from limited
snapshots make it impractical. The adjacent-filter beamspace
post-Doppler STAP is a reduced-dimension partially adaptive approach.
It performs Doppler filtering with a temporal Fourier transform and
spatial filtering with conventional beamforming before adaptive
processing. The adaptive processing is done in a selected sub-space
including a few beams and a few Doppler bins.

In the complex multi-path underwater environment, the signal will
spread over many beams (especially when the array is steered away
from broadside) and over many Doppler bins if a long estimation time
is used. Without combining these bins a processor will encounter
severe signal degradation. STAMP differs from the beamspace
post-Doppler STAP in that it uses a propagation code to model the
signal spread over beam and Doppler bins and coherently combines
them. This new approach should provide improvement in signal
estimation, while providing range and depth localization.

Single-element pre-Doppler space-time MFP was reported in ref. (1).
In this work, we study the performance of the beamspace post-Doppler
space-time adaptive MFP through a simulation. In section 2, we
describe the STAMP processing and the simulation scenario for the
forward-sector processing. The simulation results are discussed in
section 3, and a summary is given in section 4.

2.  STAMP  processing  and  Forward-Sector  Processing 
Simulation  Geometry 

Figure 1 shows the STAMP processing diagram for a multi-line array.
It starts with the Fourier transform of phone time series $x_l(t)$
into frequency domain $X_l(f)$, $X_k(f) = [X_{k1}(f) \ldots
X_{kn}(f)]$, where $k$ is the line index and $l$ is the phone index.
A conventional beamforming response $b_k(f,\theta)$ is then
calculated at each frequency bin for each towed line. A long
beam-space vector $B(f)$ is formed with beam responses at selected
beams and Doppler bins from all towed lines. The covariance matrix
$R$ is formed by the outer product of $B(f)$, ensemble averaged over
a wide Doppler band. For MFP, replicas are generated with a
propagation code and passed through the same Doppler processing and
conventional beamforming to form the beam-space replicas. The
adaptive weight vectors are calculated with the wide-band covariance
matrix $R$ and the beam-space replicas, then applied to each $B(f)$
to get the adaptive narrowband response. It is noted that STAMP will
be the same as conventional STAP when one replaces the propagation
code with a plane-wave signal model.

Figure 1: Space-Time Adaptive Matched-field Processing (STAMP)


Figure 2: Wideband-Narrowband (WB/NB) Feedback-Loop White-Noise-Constrained (FLWNC) adaptive processing

Figure 2 shows the processing diagram of wideband-narrowband (WB/NB)
Feedback-Loop White-Noise-Constrained (FLWNC) adaptive processing. At
each search cell, FLWNC iteratively adjusts the additive white noise
until the white noise processing gain $|w|^2$ falls within the
constraints $\delta_1$ and $\delta_2$. The calculated adaptive weight
is then used to filter snapshots at each Doppler bin. This is called
wideband-narrowband processing because the weight is calculated with
the covariance matrix that is ensemble averaged over a broader
Doppler band and is then applied to narrowband snapshots at each
Doppler bin.
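The principle of adjusting additive white noise until the weight norm is acceptable can be sketched with NumPy as below. This is a deliberately simplified, one-sided version (it only enforces an upper bound on the squared weight norm, by doubling the added noise), not the two-sided feedback loop of the paper; the function name, the starting noise level and the example covariance are assumptions.

```python
import numpy as np

def white_noise_constrained_weights(cov, replica, max_gain, eps0=1e-6, max_iter=60):
    """Distortionless adaptive weights with a cap on the squared weight norm.

    White noise eps * I is added to the covariance and doubled until
    |w|^2 <= max_gain. max_gain must exceed 1 / (replica^H replica),
    the norm reached in the heavily loaded (conventional) limit.
    """
    n = cov.shape[0]
    eps = eps0
    for _ in range(max_iter):
        loaded = cov + eps * np.eye(n)
        r_inv_d = np.linalg.solve(loaded, replica)
        weights = r_inv_d / np.vdot(replica, r_inv_d)   # enforces w^H d = 1
        if np.vdot(weights, weights).real <= max_gain:
            break
        eps *= 2.0                                      # add more white noise
    return weights, eps

# Hypothetical 8-element example: strong interferer close to the look direction.
n = 8
steering = np.ones(n, dtype=complex)
interferer = np.exp(1j * 0.2 * np.arange(n))
cov = 0.01 * np.eye(n) + 100.0 * np.outer(interferer, interferer.conj())
w, eps = white_noise_constrained_weights(cov, steering, max_gain=2.0 / n)
```

Adding white noise trades interference rejection for robustness: the more noise is added, the closer the weights come to the conventional (non-adaptive) ones.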


Figure 3 shows the simulation geometry of forward-sector processing.
The own-ship noise and its bottom-bounce energy are treated as
stationary broadband point interference. The target at 90 m depth
broadcasts a narrowband signal and moves toward the tow ship with a
relative speed of 6 kts. In the simulations, three array
configurations were considered: Single-Line, 4-Line-Sequential, and
4-Line-Vertical. Each single line consists of 48 phones with a
spacing of 2.25 m. The arrays are at a nominal depth of 90 m. The
4-Line-Sequential configuration connects four single lines to form a
long line. The 4-Line-Vertical configuration stacks four single lines
vertically with a vertical spacing of 10 m.

Figure 3: Simulation geometry, F=200 Hz, target (NB)=120 dB, own-ship (BB)=120 dB, bottom bounce (BB)=115 dB,
white NL=120 dB, 0.1 random phase error, no environmental mismatch.


3.  Simulation  Results 

From the conventional plane-wave beamforming of a Single-Line,
Figures 4 and 5 show beam/time responses (BTRs) and beam/Doppler
responses of each signal component, respectively. The own-ship and
the bottom interference arrive at relatively higher angles away from
the forward endfire at 0°. The target component will be buried
underneath the own-ship interference in the combined BTR, but with
256-sec integration time, it begins to separate from own-ship noise
in the beam/Doppler response. The narrowband target signal is spread
in Doppler and azimuth due to multi-paths that can be coherently
combined with MFP to enhance detection and localization. This is the
motivation of the STAMP study.
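The orders of magnitude behind this separation can be checked with a short calculation. The sketch below uses the values quoted in the simulation description (200 Hz, 6 kts, 256 s) and assumes a nominal sound speed of 1500 m/s, which is not stated in the source; the function names are illustrative.

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0   # one knot is 1852 m per hour

def doppler_shift_hz(freq_hz, speed_kts, sound_speed=1500.0):
    """One-way Doppler shift f * v / c for a source closing at speed_kts."""
    return freq_hz * speed_kts * KNOT_IN_M_PER_S / sound_speed

def doppler_resolution_hz(integration_s):
    """Frequency resolution of a Fourier transform over integration_s seconds."""
    return 1.0 / integration_s

shift = doppler_shift_hz(200.0, 6.0)            # about 0.41 Hz
resolution = doppler_resolution_hz(256.0)       # about 0.0039 Hz
bins_of_separation = shift / resolution         # about 105 Doppler bins
```

Under these assumptions the moving target is offset from stationary interference by roughly a hundred Doppler bins, which is consistent with the target beginning to separate in the beam/Doppler response.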

The top two panels in Figure 6 show the plane-wave beam spectrograms
for Single-Line steered at 10° off the forward endfire. The
high-angle own-ship noise leaks into this shallow angle and causes
the high noise background in the conventional beam spectrogram, but
is significantly suppressed by the adaptive processing. The bottom
left panel shows the STAMP track-cell-gram that tracks the target
location and the bottom right panel shows the maximum response over
Doppler. STAMP uses beams from 0° to 30° and 6 Doppler bins for the
6-kt search. It is noted that STAMP processing provides 2-3 dB more
signal gain than the plane-wave processing for Single-Line and
provides 8-9 dB more with the 4-Line-Vertical array.


Figure 7 shows the range tracking performance of STAMP. In the
simulation the target starts at 10 km and moves toward the tow ship.
With Single-Line, the conventional MFP does not provide range
discrimination of the target. With adaptive MFP, Single-Line STAMP
starts to show the target track that is closing in range. The 4-Line
configurations help to suppress the range sidelobes, and the
4-Line-Vertical array provides better performance than the
4-Line-Sequential array.

Figure 8 shows depth discrimination of STAMP range tracking with the
4-Line-Vertical array. The target track is formed only at the target
depth of 90 meters. The target-related cascaded sidelobes are seen at
other depths. Similarly, Figure 9 shows speed discrimination of STAMP
range tracking with the 4-Line-Vertical array. The target track is
formed at the target speed of 3 m/s. Away from the target speed, the
track becomes defocused and only target-related cascaded sidelobes
are seen at search speeds far away from the target speed.

4.  Summary 

STAMP  processing  that  combines  STAP  and  MFP  has 
been  developed.  Simulations  show  that  STAMP 
coherently  combines  signal  multi-path  spread  in  azimuth 
and  Doppler  and  greatly  enhances  the  target  detection  as 
well  as  providing  target  range  and  depth  classification  and 
localization.  In  a  future  study,  we  will  address  how 
robust  STAMP  is  against  array  shape  error,  frequency 
mismatch,  and  environmental  mismatch  as  well  as  how 
STAMP  performs  in  other  tactical  scenarios. 


Figure 5: Single-Line Doppler/Azimuth responses of each signal component, 256-sec integration time.

Figure 7: Array-size dependence of MFP range tracking search at target depth and target speed.

Figure 9: Speed discrimination of adaptive MFP range tracking, 4-Line-Vertical array search at target depth.

ABSTRACT

We examine the problem of passive localization of a moving target in
a littoral environment, based on its depth and range-rate. We compare
performance with the conventional matched field processor, which
localizes in depth and range. Range-rate localization is more robust
with respect to uncertainties in the environment, and with respect to
associated uncertainties in the horizontal wave numbers of the
channel modes used for the matched field target response. In our
approach the complex amplitudes of the modes are treated as nuisance
parameters, which comprise a hidden, first-order Markov state
process. In lieu of an analytic expression for the evolution of the
likelihood function as new snapshots are integrated, we evaluate a
method of particle filtering, or sequential resampling.

1.  INTRODUCTION 

Matched field processing (MFP) techniques localize targets in shallow
water environments by computing a replica vector based on channel
modes associated with a given set of environmental parameters,
including the sound-speed profile [3]. They typically suffer from
high sidelobes and ambiguous peaks produced at ambiguous ranges and
depths, a problem that is exacerbated by environmental uncertainties.
Modifications to the MVDR beamformer have been proposed to make it
more robust to these uncertainties, by constraining the weight vector
to stabilize its response over an ensemble of environments [1]. An
additional problem is target motion, which spreads the target peak,
decreasing its visibility. Previous work on target motion has focused
on applying transformations to successive data snapshots that
compensate for motion corresponding to a particular hypothesized
velocity, resulting in a focused peak in the range-depth ambiguity
surface for a target having that velocity [2]. The main idea of this
paper is to view target motion as an asset rather than a liability,
and to jointly estimate depth and range-rate in a manner that not
only compensates for target motion, but also enhances robustness to
environmental uncertainty. We propose to implement this by
constructing a state model for the replica vector, using an assumed
target velocity to constrain the state evolution, and leaving the
initial state, which depends on target range, as a nuisance
parameter. Because we do not have a closed-form, analytic solution
for the updating of the likelihood function that arises from this
state model, we instead examine a non-parametric method of
approximating the likelihood, a sequential resampling or "particle
filtering" method [4, 5].


2.  MATCHED  FIELD  PROCESSING 


Matched field processing obtains a replica vector for a target in
shallow water based on the Green's function for the target response.
For a shallow water environment, the response at the $n$th sensor can
be expanded in terms of the eigenmodes of the channel as follows
[6, 7]:

$$s_k(n) = \sum_{m=1}^{M} \sqrt{\frac{2\pi}{k_r(m)\,r}}\;
\psi_m(c_n)\,\psi_m(d)\,\exp(-j\,k_r(m)\,r). \tag{1}$$

Here $k$, $m$, and $n$ index the time of the snapshot, the mode
number, and the array sensor number, respectively. The sum is over
the $M$ eigenmodes $\psi_m(z)$ supported by the channel, where $z$ is
the depth coordinate. The eigenmodes are sampled at $\zeta_n$, the
depth of the $n$th sensor, and at $d$, the depth of the target. The
amplitude of the $m$th mode includes a phase factor whose phase is
proportional to the product of its horizontal wavenumber $k_r(m)$ and
the target range $r$. Rewriting this expression in terms of the
$N$-dimensional replica vector $\mathbf{s}_k$, where $N$ is the
number of hydrophone sensors, we have:

$$ \mathbf{s}_k(d, r) = \boldsymbol{\Psi}(\boldsymbol{\zeta}) \left[ \boldsymbol{\phi}(d, r) \odot \mathbf{x}_k(r) \right], \qquad (2) $$

where $\odot$ denotes the Hadamard, or element-by-element, vector
product. Here the $n$th element of $\mathbf{s}_k$ is $s_k(n)$, the
$(n, m)$ element of $\boldsymbol{\Psi}$ is $\psi_m(\zeta_n)$, and the
$m$th element of $\boldsymbol{\phi}(d, r)$ is
$\sqrt{2\pi / (k_r(m)\, r)}\; \psi_m(d)$. The modal phases have been
collected into a vector $\mathbf{x}_k(r)$, whose $m$th element is
given by $\exp(-j\, k_r(m)\, r)$.
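As an illustration of Equation 2, the following sketch assembles a replica vector from mode shapes, mode amplitudes, and modal phases. It assumes ideal-waveguide modes $\sin(m\pi z/h)$ rather than the Pekeris modes used in the paper, and the frequency, sound speed, and array geometry are hypothetical values chosen only so that the example runs.

```python
import numpy as np

def ideal_modes(depths, n_modes, h):
    """Mode shapes sin(m*pi*z/h) of an ideal waveguide (assumed, not Pekeris)."""
    m = np.arange(1, n_modes + 1)
    return np.sin(np.outer(np.atleast_1d(depths), m) * np.pi / h)  # (n_depths, M)

def replica_vector(sensor_depths, d, r, k_r, h):
    """Equation 2: s = Psi(zeta) [phi(d, r) * x(r)] with an element-wise product."""
    M = len(k_r)
    Psi = ideal_modes(sensor_depths, M, h)                   # (N, M)
    phi = np.sqrt(2 * np.pi / (k_r * r)) * ideal_modes(d, M, h)[0]
    x = np.exp(-1j * k_r * r)                                # modal phases
    return Psi @ (phi * x)

# Hypothetical example values (frequency and geometry are not from the paper)
h, c, f = 100.0, 1500.0, 200.0
k = 2 * np.pi * f / c
m = np.arange(1, 21)
k_z = m * np.pi / h
k_r = np.sqrt(k**2 - k_z**2)        # all 20 modes propagate for these values
zeta = np.linspace(5.0, 95.0, 10)   # sensor depths in metres
s = replica_vector(zeta, d=20.0, r=10_000.0, k_r=k_r, h=h)
```

Writing the replica as a matrix-vector product keeps the range dependence isolated in the phase vector, which is the property the state model later exploits.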

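If the target is presumed to be stationary, so that the replica
vector is constant across a window of $K$ snapshots,
$\mathbf{s}_k = \mathbf{s}$, then summing over the matched-filter
output of $K$ snapshots yields the conventional matched field
processor, or Bartlett estimate [8]:

$$ \frac{1}{K} \sum_{k=1}^{K} \frac{|\mathbf{s}^H \mathbf{y}_k|^2}{\mathbf{s}^H \mathbf{s}} = \frac{\mathbf{s}^H \mathbf{R}_{yy}\, \mathbf{s}}{\mathbf{s}^H \mathbf{s}}, \qquad (3) $$

where $\mathbf{R}_{yy} = \frac{1}{K} \sum_{k=1}^{K} \mathbf{y}_k \mathbf{y}_k^H$
is the sample correlation matrix of the data.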
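A minimal sketch of the Bartlett estimate for a single hypothesized replica vector follows. The data are synthetic and the replica vectors are random stand-ins, not modal replicas; the point is only to show the quadratic form in the sample correlation matrix.

```python
import numpy as np

def bartlett(Y, s):
    """Bartlett power s^H R s / (s^H s) for data Y (N sensors x K snapshots)."""
    K = Y.shape[1]
    R = (Y @ Y.conj().T) / K           # sample correlation matrix R_yy
    return np.real(s.conj() @ R @ s) / np.real(s.conj() @ s)

rng = np.random.default_rng(0)
N, K = 8, 50
s_true = np.exp(1j * rng.uniform(0, 2 * np.pi, N))    # hypothetical replica
alpha = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K)))
Y = np.outer(s_true, alpha) + noise                   # y_k = alpha_k s + n_k

s_wrong = np.exp(1j * rng.uniform(0, 2 * np.pi, N))   # mismatched replica
b_true, b_wrong = bartlett(Y, s_true), bartlett(Y, s_wrong)
```

In a full processor this function would be evaluated over a grid of $(d, r)$ hypotheses to form the ambiguity surface.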

This estimator is also justified by the likelihood of the data, as a
function of depth $d$ and range $r$, over the window of snapshots, if
the data has the random model
$\mathbf{y}_k = \alpha_k \mathbf{s} + \mathbf{n}_k$. Here the signal
has a Gaussian distributed complex amplitude
$\alpha_k \sim CN[0, \sigma_\alpha^2]$, and there is additive white
measurement noise $\mathbf{n}_k \sim CN[\mathbf{0}, \sigma_n^2 \mathbf{I}]$.
Then by using Woodbury identities (a brief derivation is reproduced
in Appendix 3.7 of [9]), the likelihood of the $k$th snapshot can be
shown to be

$$ f(\mathbf{y}_k | \mathbf{s}(d, r)) = \frac{1}{(\pi \sigma_n^2)^N \left(1 + \frac{\sigma_\alpha^2}{\sigma_n^2} \mathbf{s}^H \mathbf{s}\right)} \exp\left\{ -\frac{\mathbf{y}_k^H \mathbf{y}_k}{\sigma_n^2} + \frac{|\mathbf{s}^H \mathbf{y}_k|^2}{\sigma_n^2 \left(\frac{\sigma_n^2}{\sigma_\alpha^2} + \mathbf{s}^H \mathbf{s}\right)} \right\}. \qquad (4) $$

Conditioned on $\mathbf{s}$, all the data vectors $\mathbf{y}_k$ are
independent and share the same likelihood. In the high SNR limit,
$\sigma_\alpha^2 \gg \sigma_n^2$, their joint log-likelihood, as a
function of $(d, r)$, is proportional to the Bartlett estimate of
Equation 3.

An alternative approach, the MVDR beamformer or Capon spectrum, has
advantages for suppressing interfering sources and sidelobes.
However, it is more sensitive to target nulling if the presumed
target replica vector is mismatched with respect to the true target
response. In this work we investigate robustness with respect to
errors in environmental parameters, which can produce target
mismatch. We evaluate our Moving Target Depth Estimator (MTDE) and
the Bartlett estimator, or conventional matched field processor, as a
baseline estimator for comparison.


3.  TARGET  MOTION,  ENVIRONMENTAL
MISMATCH,  AND  DEPTH  AMBIGUITY

Two phenomena which degrade the performance of the Bartlett estimator
are target motion and environmental mismatch. Here we examine the
scenario in which a target is moving in range at a constant velocity
and constant depth. Target motion tends to smear out the peak target
power across range, reducing peak height and the effective
post-beamformer SNR. An example realization of an ambiguity surface
is shown in Figure 1, where the target has moved from 10 to 10.5 km
over 50 snapshots spaced two seconds apart, as indicated by the white
line segment bounded by stars. Note that there are ambiguous peaks at
other ranges at the same depth as the target (20 m), but also at a
depth of about 83 m. (The cause of the ambiguous depth will be
discussed below.)

Environmental mismatch produces mismatch of the replica vector
$\mathbf{s}$. The replica vector is a function of both range and
depth, but it is range localization that seems to be more seriously
affected by mismatch, as shown in Figure 2. To understand this,
consider Equation 2. The depth dependence is contained in the vector
$\boldsymbol{\phi}$, the mode amplitudes sampled at the source depth.
This vector also has a global scaling inversely proportional to
range; the average range effectively scales the signal power
$\sigma_\alpha^2$. The primary dependence on range is contained in
the vector of modal phases $\mathbf{x}_k$, each phase being
proportional to the product of the horizontal wavenumber and the
range, $k_r(m) \cdot r$. A small mismatch in the wavenumber $k_r(m)$
can cause a big mismatch in the phase, as it is multiplied by a range
$r$ that can be on the order of several kilometers.


Our initial investigation of environmental mismatch is based on the
simple Pekeris model, with a uniform sound speed in the water channel
and in the ocean bottom [6, 7]. Mismatch is implemented by using the
Pekeris model to generate synthetic data, then perturbing the
vertical wavenumbers $k_z$ assumed in processing the data and forming
an ambiguity surface. The wavenumbers are perturbed by a uniform
random variable whose extent we express as a fraction of $\pi/h$, the
approximate spacing of the vertical wavenumbers, where $h = 100$ m is
the depth of the water column. Then the assumed modes $\psi_m(z)$ and
horizontal wavenumbers $k_r = \sqrt{k^2 - k_z^2}$ are computed
accordingly. For example, in Figure 2, they are perturbed by
$\pm 0.45\, \pi/h$, the total range of the perturbations being
$0.9\, \pi/h$. Note that while the range information has been lost,
there is still significant energy distributed, across several ranges,
at the correct depth of 20 m and at an ambiguous depth of about
83.5 m.
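The perturbation procedure can be sketched as follows. The sketch assumes ideal vertical wavenumbers $m\pi/h$ as the unperturbed values and a hypothetical 200 Hz source in 1500 m/s water; neither is stated in the text. It also evaluates the resulting modal phase error at a 10 km range, to illustrate why range information is lost first.

```python
import numpy as np

def perturbed_wavenumbers(k, h, n_modes, fraction, rng):
    """Perturb ideal vertical wavenumbers m*pi/h by a uniform random variable
    whose total extent is `fraction` * (pi/h); return (k_z, k_r)."""
    m = np.arange(1, n_modes + 1)
    spacing = np.pi / h
    k_z = m * spacing + rng.uniform(-0.5, 0.5, n_modes) * fraction * spacing
    k_r = np.sqrt(k**2 - k_z**2)       # horizontal wavenumbers
    return k_z, k_r

h = 100.0
k = 2 * np.pi * 200.0 / 1500.0         # hypothetical 200 Hz source, c = 1500 m/s
rng = np.random.default_rng(1)
k_z, k_r = perturbed_wavenumbers(k, h, n_modes=20, fraction=0.9, rng=rng)

# Phase error accumulated at r = 10 km relative to the unperturbed modes
k_r0 = np.sqrt(k**2 - (np.arange(1, 21) * np.pi / h) ** 2)
phase_error = (k_r - k_r0) * 10_000.0
```

With these assumed values the phase errors reach many radians, so the modal phases at a given range are effectively scrambled even though the mode amplitudes change little.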

To see the source of the depth ambiguity, refer to Figure 3, in which
we plot $\boldsymbol{\phi}(d)$ at $d = 20$ m and $d = 83.5$ m, as
well as the magnitude of $\boldsymbol{\phi}(d)$. While the mode
amplitudes are different at the two depths, undergoing a relative
sign change every other mode, the magnitudes are approximately equal.
The sign change can be compensated by a corresponding sign change in
the modal phase vector $\mathbf{x}_k$, which may occur at another
range, as seen in both Figures 1 and 2. For a “perfect”
constant-index waveguide, in which the amplitudes go to zero at the
bottom, the ambiguity is exact. In this special case the modes are
sinusoidal, given by $\sin(\frac{\pi d}{h} m)$. Now let us try to
identify two depths $d_1$ and $d_2$ at which the magnitudes of the
modes are equal. Equating the magnitudes yields

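$$ \left| \sin\left(\frac{\pi d_1}{h} m\right) \right| = \left| \sin\left(\frac{\pi d_2}{h} m\right) \right| \;\longrightarrow\; \cos\left(\frac{2\pi d_1}{h} m\right) = \cos\left(\frac{2\pi d_2}{h} m\right). \qquad (5) $$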

Using the fact that discrete-time sinusoidal signals of the form
$\exp(j \omega m)$ are the same for frequencies $\omega$ that are
separated by an integer multiple of $2\pi$, we have the following
solution:

$$ \frac{2\pi d_1}{h} = 2\pi - \frac{2\pi d_2}{h} \;\longrightarrow\; d_1 = h - d_2. \qquad (6) $$

Thus depth values that are symmetric about the middle of the water
channel are potentially ambiguous. The ambiguous depth shown in
Figure 3, approximately 83.5 m, is a little deeper than the depth
predicted by Equation 6, 80 m, since a more realistic “soft” boundary
condition is used that causes the eigenmodes to be non-zero in the
sediment bottom, and also decreases their vertical wavenumbers,
“stretching” them slightly.
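The exact ambiguity for the ideal waveguide can be checked numerically. This sketch uses the sinusoidal modes and $h = 100$ m from the text; the number of modes (20) is an assumed value.

```python
import numpy as np

h = 100.0                      # water depth in metres, as in the text
m = np.arange(1, 21)           # mode numbers (20 modes is an assumed count)

def mode_amplitudes(d):
    """Ideal-waveguide mode amplitudes sin(pi*d*m/h) at depth d."""
    return np.sin(np.pi * d * m / h)

a_target = mode_amplitudes(20.0)
a_mirror = mode_amplitudes(h - 20.0)       # depth predicted by Equation 6

same_magnitude = np.allclose(np.abs(a_target), np.abs(a_mirror))
# Relative sign between the two depths alternates with mode number
relative_sign = np.sign(a_target * a_mirror)
```

The alternating relative sign is what a shifted modal phase vector can absorb, which is why the ambiguous peak tends to appear at a different range.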


4.  STATE  MODEL  FOR  TARGET  MOTION 

We wish to accommodate target motion, and discard the presumption of
stationarity of the replica vector used in obtaining the Bartlett
processor ($\mathbf{s}_k = \mathbf{s}$). We accommodate motion by
using a hidden state process, where the state is given by the modal
phases $\mathbf{x}_k$ of Equation 2:

$$ \mathbf{x}_k = \mathbf{A}(\dot{r}) \left( \mathbf{u}_k \odot \mathbf{x}_{k-1} \right). \qquad (7) $$

Here the state transition matrix $\mathbf{A}(\dot{r})$ is a diagonal
matrix of phase factors, with the $m$th diagonal element being given
by $\exp(-j\, k_r(m)\, \dot{r}\, \Delta t)$, where $\dot{r}$ is the
range-rate, or horizontal velocity, and $\Delta t$ is the time
between snapshots. The initial phase vector $\mathbf{x}_0$ is
unknown. So what is assumed known in this model is not the initial
range of the target, but only the change in range, or range-rate. The
state-noise vector $\mathbf{u}_k$ consists of small phase
perturbations. Its purpose is primarily to relax slightly the
constraint imposed on the state sequence by the presumed horizontal
wavenumbers $k_r(m)$, which may have errors.
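One step of the state transition in Equation 7 might be sketched as below. The form of the state noise, $\exp(j\epsilon)$ with small Gaussian $\epsilon$, is an assumption, since the text only describes it as small phase perturbations, and the wavenumber values are hypothetical. The range-rate of 5 m/s is roughly what the Figure 1 scenario implies (500 m in about 100 s).

```python
import numpy as np

def transition(x_prev, k_r, range_rate, dt, phase_noise_std=0.0, rng=None):
    """One step of Equation 7: x_k = A(rdot) (u_k * x_{k-1}).

    A is diagonal, so it is applied as an element-wise product. The state
    noise u_k is modelled as exp(j*eps) with small Gaussian eps, which is an
    assumed form."""
    a_diag = np.exp(-1j * k_r * range_rate * dt)
    if phase_noise_std > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        u = np.exp(1j * phase_noise_std * rng.standard_normal(len(k_r)))
    else:
        u = np.ones(len(k_r))
    return a_diag * (u * x_prev)

k_r = np.linspace(0.55, 0.83, 20)      # hypothetical horizontal wavenumbers
r0, rdot, dt = 10_000.0, 5.0, 2.0      # 5 m/s for 2 s moves the target 10 m
x0 = np.exp(-1j * k_r * r0)
x1 = transition(x0, k_r, rdot, dt)     # noise-free prediction
```

Without noise, the predicted state equals the modal phases at the new range, so only the range-rate, and not the range itself, enters the update.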

Denote the data matrix having the first $k$ data vectors as its
columns by $\mathbf{Y}_k$. Our goal is then to update the cumulative
likelihood of the data, given a depth and range-rate pair,
$f(\mathbf{Y}_k | \dot{r}, d)$, as we acquire new data vectors
$\mathbf{y}_k$. If the state vectors and the measurement vectors were
both Gaussian, with linear transition matrices, then we could apply
the expressions of Kalman filtering. The Kalman filter equations
provide expressions for a state prediction, measurement prediction,
and state update; these are the conditional means of the densities
$f(\mathbf{x}_k | \mathbf{Y}_{k-1})$, $f(\mathbf{y}_k | \mathbf{Y}_{k-1})$,
and $f(\mathbf{x}_k | \mathbf{Y}_k)$. The Kalman equations also yield
expressions for the error covariances associated with the estimates,
which are the covariances of the three densities. The conditional
means and covariances are then enough to characterize the densities,
since the densities are Gaussian. So rather than viewing the Kalman
filter as merely updating state estimates, we can view it as updating
these densities, needed in turn to update the likelihood. In standard
applications of Kalman filtering, parameters of interest, such as
target range and velocity, comprise the state vector, and state
estimates will suffice. In the application discussed here, the state
consists largely of nuisance parameters; the parameters of interest
must be inferred from the cumulative likelihood function
$f(\mathbf{Y}_k | \dot{r}, d)$ estimated from approximations to these
densities.


5.  SEQUENTIAL  RESAMPLING  FOR  STATE 
ESTIMATION 


In lieu of an analytic expression for the updated likelihood, we
employ a method of sequential resampling [4, 5]. This represents the
densities nonparametrically, by a collection of samples, known as
“particles.” Loosely, we can think of the method as evolving
histograms of samples, rather than analytic density expressions. The
process is as follows [4]: at time-step $k$, we have some
(prediction) samples $\mathbf{x}_k^{(i)}$ which are distributed as
$f(\mathbf{x}_k | \mathbf{Y}_{k-1})$. The first step is to scale, or
weight, these samples according to the likelihood of the $k$th data
snapshot. The weights are proportional to this likelihood, and
normalized to sum to one:

$$ w_k^{(i)} = \frac{f(\mathbf{y}_k | \mathbf{x}_k^{(i)})}{\sum_{i'} f(\mathbf{y}_k | \mathbf{x}_k^{(i')})}. \qquad (8) $$


For our application, the likelihood was obtained by substituting
Equation 2 into Equation 4. In the second step, the samples are
resampled with a probability given by the normalized likelihood
weights, to yield a new set of (update) samples
$\tilde{\mathbf{x}}_k^{(j)}$ which are distributed as
$f(\mathbf{x}_k | \mathbf{Y}_k)$:

$$ \mathrm{Prob}\left[ \tilde{\mathbf{x}}_k^{(j)} = \mathbf{x}_k^{(i)} \right] = w_k^{(i)}. \qquad (9) $$

After this step, one typically has a significant number of samples
$\tilde{\mathbf{x}}_k^{(j)}$ that are identical, or degenerate, since
they correspond to prediction samples $\mathbf{x}_k^{(i)}$ that have
high weight values. In the third step, the samples are translated
according to the state transition equation, and state noise is added.
This produces new prediction samples:

$$ \mathbf{x}_{k+1}^{(i)} = \mathbf{A}(\dot{r}) \left( \mathbf{u}_{k+1}^{(i)} \odot \tilde{\mathbf{x}}_k^{(i)} \right) \qquad (10) $$

(the state noise $\mathbf{u}_{k+1}^{(i)}$ has the side effect of
differentiating degenerate samples). One approach is to run separate
estimators, or “particle filters,” in parallel, one for every
hypothesized range-rate and depth $(\dot{r}, d)$. Then the likelihood
for each $(\dot{r}, d)$ pair can be approximated as follows:

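$$ f(\mathbf{y}_k | \mathbf{Y}_{k-1}; \dot{r}, d) = \int d\mathbf{x}_k\, f(\mathbf{y}_k | \mathbf{x}_k; \dot{r}, d)\, f(\mathbf{x}_k | \mathbf{Y}_{k-1}; \dot{r}, d) \approx \frac{1}{N_p} \sum_{i=1}^{N_p} f(\mathbf{y}_k | \mathbf{x}_k^{(i)}; \dot{r}, d), \qquad (11) $$

where $N_p$ is the number of particles. And the cumulative likelihood
is given by

$$ f(\mathbf{Y}_k | \dot{r}, d) = \prod_{k'=0}^{k} f(\mathbf{y}_{k'} | \mathbf{Y}_{k'-1}; \dot{r}, d). \qquad (12) $$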

In practice, running parallel particle filters is likely to be
computationally prohibitive. For example, 100 grid points in both
depth and range-rate would require 10,000 parallel estimators, each
requiring a set of samples to represent the likelihood.
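The three steps above (weight, resample, predict), together with the likelihood increment of Equation 11, can be sketched for one hypothesized $(\dot{r}, d)$ pair as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the matrix `G` is a random stand-in for $\boldsymbol{\Psi}\,\mathrm{diag}(\boldsymbol{\phi})$, the state noise is modelled as small Gaussian phase jitter, and all numerical values are made up.

```python
import numpy as np

def log_likelihood(y, s, sig_a2, sig_n2):
    """Log of Equation 4 for one snapshot y and replica vector s."""
    N = len(y)
    ss = np.real(np.vdot(s, s))
    quad = np.real(np.vdot(y, y)) / sig_n2 \
        - np.abs(np.vdot(s, y)) ** 2 / (sig_n2 * (sig_n2 / sig_a2 + ss))
    return -N * np.log(np.pi * sig_n2) - np.log1p(sig_a2 * ss / sig_n2) - quad

def particle_step(X, y, G, a_diag, sig_a2, sig_n2, phase_std, rng):
    """One weight / resample / predict cycle (Equations 8 to 11).

    X      : (Np, M) prediction samples of the modal-phase state
    G      : (N, M) matrix mapping a state x to a replica vector s = G @ x
    a_diag : diagonal of the transition matrix A(rdot)
    Returns the new prediction samples and log f(y_k | Y_{k-1})."""
    logf = np.array([log_likelihood(y, G @ x, sig_a2, sig_n2) for x in X])
    peak = logf.max()
    lik = np.exp(logf - peak)                       # rescaled to avoid underflow
    w = lik / lik.sum()                             # Equation 8
    log_increment = peak + np.log(lik.mean())       # Equation 11
    idx = rng.choice(len(X), size=len(X), p=w)      # Equation 9
    u = np.exp(1j * phase_std * rng.standard_normal(X.shape))
    X_next = a_diag * (u * X[idx])                  # Equation 10
    return X_next, log_increment

rng = np.random.default_rng(0)
N, M, Np = 6, 4, 200
G = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))  # stand-in
X = np.exp(1j * rng.uniform(0, 2 * np.pi, (Np, M)))   # unknown initial phases
a_diag = np.exp(-1j * np.linspace(0.6, 0.8, M) * 5.0 * 2.0)
y = G @ X[0] + 0.1 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
X_next, log_inc = particle_step(X, y, G, a_diag, 1.0, 0.01, 0.02, rng)
```

Summing `log_inc` over snapshots gives the logarithm of the cumulative likelihood in Equation 12; working in the log domain avoids underflow when many snapshots are combined.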

An alternative approach is to include the parameters of interest, in
this case $(\dot{r}, d)$, in the state vector, with a uniform prior.
Then an ambiguity surface may be obtained by plotting the marginal
histogram of the update samples $\tilde{\mathbf{x}}_k^{(i)}$, which
are distributed as $f(\mathbf{x}_k | \mathbf{Y}_k)$. The marginal
histogram gives an approximation to $f(\dot{r}, d | \mathbf{Y}_k)$,
which with a uniform prior on $(\dot{r}, d)$ is proportional to the
likelihood $f(\mathbf{Y}_k | \dot{r}, d)$.

A difficulty of this approach in practice is that the sequential
resampling techniques tend to display the behavior of competitive
dynamical systems (see, for example, [10]). That is, even if two
sample regions have an equal level of fitness with respect to the
likelihood, over time one of them will tend to “win” and monopolize
the samples. In our investigation we observed that with a target at
20 m, some trials would show a peak at 20 m, while other trials would
show a peak at the depth ambiguity of 83.5 m. So the ambiguity
surface of a single trial does not reflect the intrinsic ambiguity
over the ensemble of trials; it gives an over-optimistic picture of
the ambiguity surface, and is misleading in this respect.

To alleviate this problem, we chose a hybrid approach, putting
velocity in the state vector, but leaving depth, which we treat as
the parameter of primary interest, out of the state vector. This
requires a separate particle filter for each hypothesized depth. We
chose 100 grid points in depth, leading to 100 corresponding particle
filters. The histogram of particles at each depth provides an
estimate of $f(\dot{r} | \mathbf{Y}_k, d)$. At each depth, we can use
the equivalent of Equation 12 in order to obtain the likelihood:

$$ f(\mathbf{Y}_k | d) = \prod_{k'=0}^{k} f(\mathbf{y}_{k'} | \mathbf{Y}_{k'-1}; d). \qquad (13) $$


Assuming a uniform prior on $d$, then normalizing this with respect
to $d$ provides an estimate of $f(d | \mathbf{Y}_k)$. We can then
compute an ambiguity surface as

$$ f(\dot{r}, d | \mathbf{Y}_k) = f(\dot{r} | \mathbf{Y}_k, d) \cdot f(d | \mathbf{Y}_k). \qquad (14) $$

Again, with a uniform prior on $(\dot{r}, d)$, this posterior density
is proportional to the likelihood $f(\mathbf{Y}_k | \dot{r}, d)$.
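The combination in Equation 14 can be sketched as follows, assuming each depth's particle filter supplies a cumulative log-likelihood and a set of range-rate particles. The function name, the histogram binning, and all numerical values are hypothetical.

```python
import numpy as np

def ambiguity_surface(log_lik_depth, rdot_samples, rdot_edges):
    """Combine per-depth particle histograms with the depth posterior (Eq. 14).

    log_lik_depth : (D,) cumulative log-likelihoods log f(Y_k | d)
    rdot_samples  : (D, Np) range-rate particles from each depth's filter
    rdot_edges    : histogram bin edges for range-rate
    Returns a (D, B) array approximating f(rdot, d | Y_k) per bin."""
    # Normalize over depth in the log domain to avoid underflow
    shifted = log_lik_depth - np.max(log_lik_depth)
    post_d = np.exp(shifted) / np.sum(np.exp(shifted))            # f(d | Y_k)
    hist = np.array([np.histogram(s, bins=rdot_edges)[0] for s in rdot_samples])
    cond = hist / np.maximum(hist.sum(axis=1, keepdims=True), 1)  # f(rdot | Y_k, d)
    return cond * post_d[:, None]

rng = np.random.default_rng(0)
log_lik = np.array([-1200.0, -1100.0, -1101.0])      # made-up values
samples = rng.normal(5.0, 0.5, size=(3, 500))        # made-up particles
edges = np.linspace(0.0, 10.0, 21)
surface = ambiguity_surface(log_lik, samples, edges)
```

Because every depth keeps its own filter, a depth whose likelihood is close to that of the best depth still receives visible mass in the surface, which is the behavior the hybrid approach is meant to preserve.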

To combat the problem of degeneracy of samples we implemented an
approach suggested in [4]. Namely, in the state prediction step,
additional state noise was added to differentiate degenerate samples.
Since range-rate $\dot{r}$ was included in the state, noise was added
to the range-rate values, with a standard deviation of 0.2 m/s. In
addition, the 5% of the samples with the largest weights were
automatically retained for the next step, to guard against losing a
sample value on the basis of a single snapshot only.


6.  SIMULATION  AND  RESULTS 

Figures 4 and 5 show two ambiguity surfaces obtained in this manner,
for a surface and submerged target, respectively. The SNR per sensor
element was set at 0 dB. At each hypothesized depth we ran a particle
filter with 500 samples or “particles”. Depth estimates were obtained
by taking the maximum of $f(\mathbf{Y}_k | d)$. Histograms of depth
estimates for a surface target, obtained from 100 trials of the
Bartlett estimator and the MTDE estimator, are shown in Figure 6.
Similar histograms are shown for a submerged target at a depth of
20 m in Figure 7. Note the ambiguity at a depth of about 83.5 m.
Because we run parallel particle filters for all hypothesized depths,
the ambiguity surface for a single trial of the MTDE estimator, as in
Figure 5, displays this ambiguity.

To investigate the robustness of the estimators with respect to
environmental uncertainty, the probability of correct localization
(PCL) of the target is plotted versus increasing environmental
uncertainty in Figure 8. The region of correct localization includes
±2 m around the true target depth, and around the depth ambiguity. As
discussed in Section 3, the vertical wavenumbers in the estimator
were perturbed by a uniform random variable whose range is expressed
as a fraction of $\pi/h$. The PCL is plotted as this fraction is
increased from 0 to 1.5. This environmental perturbation does not
significantly degrade the localization of the surface target, but it
does degrade the localization of the submerged target. The
degradation is not as severe for the MTDE estimator as it is for the
Bartlett estimator.
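The PCL metric as described can be computed from a set of per-trial depth estimates as in the following sketch. The function name and the example estimates are hypothetical.

```python
import numpy as np

def pcl(depth_estimates, true_depth, ambiguous_depth=None, tol=2.0):
    """Fraction of trials whose depth estimate lies within +/- tol metres of
    the true depth or, if given, of the ambiguous depth."""
    est = np.asarray(depth_estimates, dtype=float)
    correct = np.abs(est - true_depth) <= tol
    if ambiguous_depth is not None:
        correct |= np.abs(est - ambiguous_depth) <= tol
    return correct.mean()

# Made-up estimates from six hypothetical trials of a 20 m target
estimates = [19.0, 21.5, 83.0, 50.0, 84.9, 23.0]
p = pcl(estimates, true_depth=20.0, ambiguous_depth=83.5)
```

Counting the ambiguous depth as correct separates the effect of environmental mismatch from the intrinsic depth ambiguity of the waveguide.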

7.  CONCLUSIONS


In this paper we have investigated an approach of joint estimation of
range-rate and depth, rather than range and depth. Range-rate
provides another dimension with which to discriminate targets against
interfering sources (such as moving ships). In addition,
discrimination based on range-rate is more robust with respect to
environmental uncertainties, as verified by simulations. In lieu of
an analytic expression of the updated likelihood, we have
investigated a technique of sequential resampling or particle
filtering. The limitation of this particular technique seems to lie
in its ability to compensate for low SNR by integrating over many
snapshots. It should be emphasized, however, that this is a
limitation of the particle-filter implementation investigated here,
and not a limitation of the basic state-model approach of localizing
with respect to range-rate and depth, rather than range and depth.
Our future work will be focused on implementations that more
effectively exploit the entire data history.

Figure 3: Modes at two ambiguous depths: (a) their amplitudes, and
(b) their magnitudes.

Figure 4: MTDE estimator: depth/range-rate ambiguity surface for a
surface target.

Figure 5: MTDE estimator: depth/range-rate ambiguity surface for a
submerged target.

Figure 6: Histograms of depth estimates for a surface target.

Figure 7: Histograms of depth estimates for a submerged target.

Figure 8: Probability of correct localization for Bartlett and MTDE
estimators, on both surface and submerged targets.

ABSTRACT

An approach is developed for adaptive beamforming for mobile sonars
operating in an environment with moving interference from surface
shipping. It is assumed that the sound source of each ship is drawn
from an ensemble of Gaussian random noise, but each ship moves at
constant speed along a deterministic course. An analytic expression
for the ensemble mean covariance is obtained. In practice the
location, course, speed, mean noise level, and transmission loss of
each interferer are not known with sufficient precision to use the
modeled ensemble mean as a basis for adaptive beamforming. The
problem is thus to accurately estimate the ensemble mean based on
data samples. The analytic ensemble mean is not stationary, and thus
is not well estimated by the sample mean. The ensemble of covariance
samples consists of rapidly varying random terms associated with the
emitted noise and more slowly oscillating deterministic terms
associated with the source and receiver motion. The non-stationary
ensemble covariance mean can be estimated by filtering out the
rapidly varying noise while retaining the slow oscillatory terms.
Performance of the filters can be visualized and assessed in the
“epoch” frequency domain, the Fourier transform of the covariance
samples. In this domain, higher bearing rates show up at higher
frequencies. The traditional sample mean estimator retains only the
zero-frequency bin corresponding to stationary interference.
Techniques that can identify and include the appropriate non-zero
frequency contributions are better non-stationary estimators than the
sample mean. Several such techniques are offered and compared.
Simulations are invaluable in evaluating the filter performance,
since the ensemble mean can be precisely calculated analytically in
the simulation, and compared directly with the sample estimates.
Simulations of adaptive beamformers using covariance filtering will
be shown to yield improved robustness to shipping motion.

1.  INTRODUCTION 

At low frequencies, underwater noise is dominated by shipping
sources. These sources can be extremely loud, and can dominate the
performance of low-frequency passive sonar systems. Since these
sources are typically spatially discrete, adaptive techniques ought
to be able to eliminate their influence when surveillance is
performed in locations in between the loud ships. Unfortunately, the
shipping sources are moving, and hence violate the stationary noise
assumptions of current adaptive techniques. Current implementations
of adaptive beamformers often do not achieve much gain above
conventional, non-adaptive beamformers and hence remain limited by
the loud sources of interference. Here we suggest a new class of
techniques that may robustly achieve the rejection of loud sources of
moving interference.

2.  PHYSICAL  MODEL  OF  SHIPPING  NOISE

Current  adaptive  techniques  are  based  on  the  physical 
assumption  that  the  sources  of  interference  are  stationary 
in  space.  This  is  clearly  not  valid  for  the  case  of  moving 
ocean  shipping  sources.  Hence,  we  must  develop  a  new 
physical  model  for  the  interference  in  order  to  derive  the 
appropriate  adaptive  processing. 

2.1  Pressure  Field 

Begin  by  assuming  an  arbitrary  set  of  ships  under 
deterministic  motion  in  an  arbitrary  underwater  sound 
channel.  We  focus  on  a  single  frequency,  with  the 
assertion  that  the  model  can  be  extended  to  the  broadband 
case  by  a  straightforward  summation  across  frequencies. 

In the selected frequency bin, it is reasonable to model the sound
source of each ship by a draw from an ensemble of complex Gaussian
random noise, and assume that the noises of different ships are
fundamentally independent.

These sources are then propagated to each receiver array element. The
propagation may be described by a coherent sum over modes [1]. In a
range-independent environment, these modes arise naturally with the
use of a normal mode propagation model. In range-dependent
environments, the propagation can be expanded as a sum of local modes
in the vicinity of the receiver. This local mode expansion is
explicit via the use of coupled or adiabatic mode propagation models,
but in principle can be obtained from the field output of any
propagation modeling technique. The received acoustic pressure $p_n$
at the $n$th element in an array is a sum across ships of the sum
over the local modes:


$$ p_n(t) = \sum_j \sum_m s_j\, A_{mn}\, e^{i k_m r_{jn}(t)} $$

where $s_j$ is the source noise sample, $A_{mn}$ is the mode
amplitude and $k_m$ is the wavenumber of the $m$th mode, and $r_{jn}$
is the range from the $j$th ship to the $n$th receiver element. Note
that the mode amplitudes must incorporate cylindrical spreading and
attenuation terms not given explicitly here. The pressure consists of
random contributions from the ship noise sources and deterministic
time-varying propagation contributions.

2.2  Covariance 

Optimal  adaptive  processing  is  determined  from  the  mean 
of  the  covariance  among  sensor  pressures.  This 
expectation  must  be  taken  across  the  random  ensemble  of 
ship  sources.  The  ensemble-mean  covariance  will  be  a 
function  of  time  because  of  the  time  varying  propagation 
terms.  Therefore  the  expected  covariance  cannot  be 
obtained  directly  from  a  sample  mean  across  time  samples 
of  the  covariance.  Using  the  independence  of  different 
ships  an  analytic  expression  for  the  ensemble  covariance  is 
obtained: 

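$$ \langle p_{n_1} p_{n_2}^* \rangle = \sum_j \sum_{m_1} \sum_{m_2} \langle s_j s_j^* \rangle\, A_{m_1 n_1} A_{m_2 n_2}^*\, e^{i \left( k_{m_1} r_{j n_1}(t) - k_{m_2} r_{j n_2}(t) \right)} $$

where the brackets indicate the expectation across the ensemble. The
term $\langle s_j s_j^* \rangle$ is the power spectrum of the $j$th
source.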

If  the  ranges,  propagation  modes,  and  source  level  power 
spectra  were  all  known,  this  model  expression  could  be 
calculated  at  each  time  and  used  in  a  standard  minimum 
variance  distortionless  response  (MVDR)  full-rank  ABF 
[2,  3].  This  approach  might  be  termed  the  full  knowledge 
a  priori  model-based  MVDR  method.  Such  an  ABF 
would  move  its  nulls  in  time  to  optimally  reject  noise  from 
all  the  moving  ships.  Unfortunately,  it  is  unlikely  in 
practice  that  full  knowledge  will  be  available  a  priori. 
Precisely  predicting  the  propagation  structure  is  quite 
difficult  given  the  spatial  and  temporal  variability  of  the 
ocean.  It  is  also  unlikely  that  the  exact  source  power 
spectra  will  be  known  for  every  contributing  ship.  Thus, 
we  usually  must  attempt  to  estimate  the  unknowns  in  the 
ensemble  mean  covariance  from  data  samples. 
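For reference, the standard MVDR weight computation that such a model-based covariance would feed can be sketched as below. The plane-wave steering vectors and the single-interferer covariance are assumed simplifications for illustration; the text itself uses a modal propagation model.

```python
import numpy as np

def mvdr_weights(R, steering):
    """Standard MVDR weights w = R^{-1} v / (v^H R^{-1} v)."""
    Rinv_v = np.linalg.solve(R, steering)
    return Rinv_v / np.vdot(steering, Rinv_v)

def steering_vector(n_elem, spacing_wavelengths, bearing_deg):
    """Plane-wave steering vector for a uniform line array (an assumed
    simplification of the modal replica)."""
    n = np.arange(n_elem)
    phase = 2 * np.pi * spacing_wavelengths * n * np.cos(np.deg2rad(bearing_deg))
    return np.exp(1j * phase)

N = 16
v_look = steering_vector(N, 0.5, 90.0)             # look direction (broadside)
v_ship = steering_vector(N, 0.5, 70.0)             # loud interferer
R = 100.0 * np.outer(v_ship, v_ship.conj()) + np.eye(N)    # modeled covariance
w = mvdr_weights(R, v_look)

gain_look = np.abs(np.vdot(w, v_look))             # distortionless response
gain_ship = np.abs(np.vdot(w, v_ship))             # response toward the ship
conv_ship = np.abs(np.vdot(v_look / N, v_ship))    # conventional beamformer
```

If the ship moves, `v_ship` and hence `R` change with time, so the weights must be recomputed from an up-to-date covariance for the null to follow the ship.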

3.  ALGORITHMS 


Since the ensemble mean involves deterministic time-varying terms, it
cannot be reliably estimated directly from a sample mean taken over
time. In particular, the oscillatory nature of the exponential terms
will produce a sample mean that tends to zero over long estimation
times, while the ensemble mean is significantly larger. To avoid
underestimating the ensemble mean, alternatives to the sample mean
are considered.
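The collapse of the sample mean can be illustrated with a single noise-free oscillating term. The epoch frequency and amplitude below are hypothetical values.

```python
import numpy as np

omega = 2 * np.pi * 0.01          # hypothetical epoch frequency, rad/sample
amplitude = 3.0                   # hypothetical |ensemble-mean covariance|

def sample_mean_magnitude(n_samples):
    """|time average| of amplitude*exp(j*omega*t) over n_samples samples."""
    t = np.arange(n_samples)
    return np.abs(np.mean(amplitude * np.exp(1j * omega * t)))

short = sample_mean_magnitude(10)      # about a tenth of a cycle
long = sample_mean_magnitude(1000)     # ten full cycles
```

Over a short window the average stays close to the true magnitude, while over many cycles it cancels almost completely, even though the instantaneous ensemble mean never shrinks.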

3.1  Fourier  Analysis  and  Synthesis 

An  alternative  to  sample  averaging  is  to  apply  Fourier 
analysis  to  covariance  samples.  One  motivation  for  this 
approach  is  to  separate  the  differing  time  scales  involved. 
The  random  source  noise  varies  rapidly  from  one  sample 
to  the  next.  This  rapid  variation  produces  a  sample  noise 
that  is  nearly  white.  This  sample  noise  will  corrupt 
estimates  of  the  ensemble  mean  covariance  unless  it  is 
removed.  The  deterministic  amplitudes  and  phases  from 
the  propagation  terms  vary  more  slowly  and  continuously 
in  time.  A  low  pass  filter  is  expected  to  separate  the 
rapidly  varying  sample  noise  from  the  slowly  varying 
propagation  terms.  Since  filter  behavior  is  often  best 
analyzed  in  the  frequency  domain,  this  motivates 
transforming  the  covariance  samples  to  a  corresponding 
frequency  domain.  This  domain  will  be  referred  to  as  the 
epoch  frequency  domain  to  distinguish  it  from  the  acoustic 
frequency. 

A second motivation for considering the Fourier transform of the
covariance samples can be obtained by considering the time dependence
of the propagation terms. The propagation amplitudes typically evolve
very slowly in time, and this variation may be neglected for the
moment. The most rapidly changing term is the phase term due to the
changing ranges to the interference sources. Expand the ranges in a
Taylor series about some reference time:

$$ r \approx r_0 + \dot{r}\, t + \ldots $$

where $r_0$ is the range at the reference time $t = 0$ and $\dot{r}$
is the initial range rate of the source. Again for the moment, higher
order terms will be neglected. The ensemble covariance can now be
approximated by


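$$ \langle p_{n_1} p_{n_2}^* \rangle \approx \sum_j \sum_{m_1} \sum_{m_2} \langle s_j s_j^* \rangle\, A_{m_1 n_1} A_{m_2 n_2}^*\, e^{i \left( k_{m_1} r^0_{j n_1} - k_{m_2} r^0_{j n_2} \right)}\, e^{i \left( k_{m_1} \dot{r}_{j n_1} - k_{m_2} \dot{r}_{j n_2} \right) t} $$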

In this form, the unknowns (source power spectra and propagation
amplitudes) are coefficients of complex exponentials with epoch
frequencies
$\Omega = k_{m_1} \dot{r}_{j n_1} - k_{m_2} \dot{r}_{j n_2}$. This
suggests that these unknown coefficients can be estimated by Fourier
analysis. Once the coefficients are estimated, the original time
series for the ensemble covariance is reconstructed via Fourier
synthesis.

The overall approach is summarized as follows. First obtain time
samples of the elements of the covariance matrix, as is currently
done in ABF. For each matrix element, transform the time samples of
covariance to the epoch frequency domain. Identify the appropriate
frequencies associated with the moving ships, and use those frequency
coefficients to synthesize the ensemble mean time series. The only
portion of the algorithm remaining to specify is the technique of
identifying which frequencies in the epoch domain are associated with
the shipping noise sources and which are dominated by sample noise.
Several methods can be employed.
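The analysis and synthesis steps can be sketched for one covariance matrix element as follows. The selection rule used here, keeping a fixed number of the lowest epoch-frequency bins, is only one simple choice, and the signal and noise are synthetic. The sample noise is modelled as additive white noise, which is a simplification.

```python
import numpy as np

def lowpass_covariance(c_samples, n_keep):
    """Estimate a slowly varying covariance element by keeping only the
    lowest |epoch frequency| bins of its time series (n_keep bins on each
    side of zero) and synthesizing back to the time domain."""
    C = np.fft.fft(c_samples)
    idx = np.fft.fftfreq(len(c_samples), d=1.0 / len(c_samples))  # bin indices
    C[np.abs(idx) > n_keep] = 0.0
    return np.fft.ifft(C)

rng = np.random.default_rng(0)
T = 256
t = np.arange(T)
truth = 2.0 * np.exp(2j * np.pi * 3 * t / T)     # hypothetical moving-ship term
noise = rng.standard_normal(T) + 1j * rng.standard_normal(T)
samples = truth + noise                          # noisy covariance samples

est_filtered = lowpass_covariance(samples, n_keep=4)
est_mean = np.full(T, samples.mean())            # sample mean = zero bin only

err_filtered = np.mean(np.abs(est_filtered - truth) ** 2)
err_mean = np.mean(np.abs(est_mean - truth) ** 2)
```

Setting `n_keep=0` reproduces the conventional sample mean, so the sample-mean estimator is the narrowest member of this family of filters.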

3.2  Covariance  Low-pass  Filtering  Methods 

Since  ships  generally  do  not  change  range  significantly 
within  a  few  time  samples,  the  shipping  noise  is  expected 
to  nearly  always  occur  in  the  lowest  frequency  bins  of  the 
epoch  frequency  domain,  while  sample  noise  is  expected  to 
be  nearly  white  across  all  bins.  Hence,  appropriate  low 
pass  filters  are  expected  to  retain  much  of  the  shipping 
noise  energy  to  be  estimated,  while  rejecting  the  sample 
noise.  The  best  selection  of  pass  band  is  made  based  on 
the  expected  motion  of  the  contributing  ships.  Current 
ABF  algorithms  that  employ  the  sample  mean  are  in  fact 
an  example  of  covariance  low  pass  filtering,  since  the 
sample  mean  is  the  low  pass  filter  that  retains  only  the 
spectral  power  in  the  zero-frequency  bin.  The 
performance  of  any  low  pass  filter  can  be  improved  by 
matching  the  filter  width  to  the  expected  epoch  frequency 
widths  associated  with  typical  ship  motion.  For  rapidly 
moving  ships,  this  can  be  achieved  by  retaining  more 
frequency  bins  in  the  filter.  In  order  to  further  improve 
over  current  algorithms,  advantage  must  be  taken  of  the 
specifics  of  the  epoch  frequency  structure  of  the  shipping 
noise. 

The epoch frequency for each ship given above depends on
the  difference  of  the  products  of  a  wavenumber  times  a 
range  rate.  Underwater  acoustic  wavenumbers  of  the 
significant  modes  generally  do  not  exhibit  much  spread. 
Furthermore,  for  operational  horizontal  line  arrays,  the 
interfering  ships  will  almost  always  occur  at  ranges 
significant  relative  to  the  horizontal  separation  between 
array  elements.  In  these  cases  the  epoch  frequency  where 
a  ship  contributes  can  be  approximated  by 

$\Omega \approx k_0 \, \Delta x \, \dot{\theta} \sin\theta$

where $k_0$ is a reference wavenumber, $\Delta x$ is the horizontal
separation between elements, $\theta$ is the bearing to the ship
(relative to the line between the elements), and $\dot{\theta}$ is the
bearing rate. Note that the epoch frequency increases
approximately  linearly  with  separation  between  elements. 
This suggests a second filtering approach, in which the low
pass filter frequency width is increased in
proportion to element separation. Elements near the main
diagonal  of  the  covariance  matrix  are  less  affected  by 
source  motion,  and  hence  can  be  estimated  with  narrower 


low-pass  filters.  The  most  separated  elements  at  the 
farthest corners of the matrix are the most subject to source
motion,  and  require  the  highest  bandwidth  low-pass  filter. 
The  maximum  bandwidth  can  be  selected  to  match  the 
highest  bearing  rate  typically  encountered. 
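
As a hedged illustration of the element-dependent low-pass filter, the sketch below builds a boolean bin mask whose cutoff grows in proportion to element separation. The function names, the epoch sampling interval `dt`, and the worst-case choice of sin(bearing) = 1 for the cutoff are assumptions made for this example only.

```python
import numpy as np

def epoch_frequency(k0, dx, bearing, bearing_rate):
    """Approximate epoch frequency (rad/s): k0 * dx * bearing_rate * sin(bearing)."""
    return k0 * dx * bearing_rate * np.sin(bearing)


def element_dependent_lowpass_mask(n_time, dt, positions, k0, max_bearing_rate):
    """Boolean mask of shape (n_time, n_elem, n_elem).

    Bin k of covariance element (i, j) is kept when its epoch frequency
    magnitude does not exceed k0 * |x_i - x_j| * max_bearing_rate.
    dt is the (assumed) time between covariance samples in seconds.
    """
    positions = np.asarray(positions, dtype=float)
    omega_bins = 2.0 * np.pi * np.fft.fftfreq(n_time, d=dt)      # rad/s, signed
    separation = np.abs(positions[:, None] - positions[None, :])
    cutoff = k0 * separation * max_bearing_rate                  # worst case sin = 1
    return np.abs(omega_bins)[:, None, None] <= cutoff[None, :, :]
```

On the main diagonal the separation is zero, so only the zero-frequency bin is kept and the estimate reduces to the sample mean; the corner elements receive the widest pass band.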

3.3  Covariance  Band-pass  Filtering  Methods

Further improvements in estimation may potentially be
obtained by retaining only those epoch frequency bins
containing significant shipping noise. One method
uses partial knowledge available a priori. When the
locations  and  tracks  of  the  significant  ships  are 
independently  known,  for  example  from  radar  surveillance, 
then  the  bearing  rates  can  be  calculated  and  the  epoch 
frequency  bins  identified.  The  energy  in  the  identified  bins 
then  represents  estimates  of  the  unknown  propagation  and 
source  level  terms.  Fourier  synthesis  using  only  the 
identified  bins  produces  the  desired  covariance  time  series. 
The  entire  process  can  be  described  as  a  set  of  band  pass 
filters,  where  each  narrow  pass  band  is  selected  based  on 
the  knowledge  a  priori  of  the  bearing  rates. 
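
A minimal sketch of how such band-pass filters might be selected when the bearings and bearing rates are known is given below. The function name, the sign convention for the epoch frequency, the sampling interval `dt`, and the pass-band half width in bins are all assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def apriori_bandpass_mask(n_time, dt, positions, k0,
                          bearings, bearing_rates, half_width=1):
    """Boolean mask (n_time, n_elem, n_elem) of epoch frequency bins to keep.

    For each ship and each element pair, the predicted epoch frequency
    k0 * (x_i - x_j) * bearing_rate * sin(bearing) is mapped to the nearest
    FFT bin, and that bin plus half_width neighbours on each side are kept.
    """
    positions = np.asarray(positions, dtype=float)
    n_elem = positions.size
    d_omega = 2.0 * np.pi / (n_time * dt)             # bin spacing in rad/s
    signed_sep = positions[:, None] - positions[None, :]
    rows, cols = np.indices((n_elem, n_elem))
    mask = np.zeros((n_time, n_elem, n_elem), dtype=bool)
    for theta, theta_dot in zip(bearings, bearing_rates):
        omega = k0 * signed_sep * theta_dot * np.sin(theta)
        centre = np.rint(omega / d_omega).astype(int)
        for offset in range(-half_width, half_width + 1):
            # Negative frequencies wrap to the upper FFT bins
            mask[(centre + offset) % n_time, rows, cols] = True
    return mask
```

The resulting mask has the same layout as the covariance spectrum, so it can be applied element by element before Fourier synthesis.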

When no knowledge is available a priori, the potential
exists  to  take  advantage  of  the  linear  dependence  on 
separation.  Energy  from  each  individual  ship  will  lie  along 
a  line  in  the  separation-epoch  frequency  plane.  Line 
detection  methods  in  this  plane  have  the  potential  to 
automatically  identify  the  appropriate  bearing  rates 
associated  with  significant  interfering  energy.  Such 
methods  may  include  Radon  or  Hough  transforms  [4]. 

Once  the  appropriate  bins  have  been  identified,  band  pass 
filters  can  be  constructed  to  filter  the  shipping  noise  from 
the  sample  noise. 

4.  SIMULATION 

A  simulation  was  performed  to  demonstrate  the  potential 
utility  of  these  techniques.  In  the  simulation,  the  exact 
ensemble  mean  can  be  calculated  since  all  quantities  are 
known. Adaptive processing based on this exact mean
covariance gives an upper bound on the
performance that could be achieved if, for example,
perfect knowledge were available a priori. In addition to
the ensemble mean, the simulation generated time samples
of covariance from four moving ships with Gaussian noise
sources. The ships were moving at realistic speeds of
between 10 and 20 knots. The tracks of the ships are
shown in figure 1. Noise from the ships was propagated
with  cylindrical  spreading  in  a  single  mode  underwater 
channel.  The  noise  was  received  on  a  line  array  of  50 
elements  with  a  design  frequency  of  60  Hz.  The 
simulation  was  performed  at  this  design  frequency. 

Figure 1. Tracks of the four ships in the simulation
relative to the array at the origin


The  simulation  calculated  the  conventional  beamformer 
response  and  compared  with  various  MVDR  beamformer 
responses  for  a  set  of  beams  spanning  all  azimuths.  The 
beamformers  used  various  estimates  of  the  ensemble  mean 
covariance.  In  addition  to  the  exact  ensemble  mean,  the 
ABF  based  on  the  sample  mean  and  the  ABF  based  on  an 
element  dependent  low-pass  filter  were  simulated.  Results 
of  the  simulation  are  summarized  by  cumulative 
distributions  of  noise  across  all  beams  shown  in  figure  2. 


Figure 2. Cumulative distributions of noise for various
beamformers (axis label: Relative Beam Noise Level)


The sample mean ABF performed almost no better than
conventional non-adaptive beamforming in the simulation. The
sample mean was unable to properly capture the motion of
the ships, and hence was unable to place nulls in the proper
locations to cancel the ship noise. The ABF based on the
element-dependent low-pass filtered sample covariance showed a
median noise reduction of about 5 dB relative to the
sample mean approach. The ABF based on the exact ensemble
mean displayed a 10 dB reduction in noise relative to the sample
mean method.


5.  CONCLUSIONS 

The  problem  of  adapting  in  the  presence  of  moving 
sources  of  interference  was  considered.  Application  was 
particularly  addressed  to  the  motion  of  interfering  surface 
ship  noise  for  passive  sonar  arrays.  The  physics  of  ship 
motion  was  modeled,  including  the  received  noise  field 
and  the  noise  covariance  matrix.  An  analytic  expression  of 
the  ensemble  mean  covariance  was  obtained.  This 
physical  model  suggested  a  new  approach  of  covariance 
filtering  to  better  estimate  the  ensemble  mean  covariance 
from  data  samples. 

Two  paradigms  of  current  adaptive  beamforming  may  need 
to  be  abandoned  in  the  presence  of  interference  motion. 
First,  the  sample  mean  may  not  be  the  appropriate 
estimator  when  the  interference  sources  are  in  motion. 
Second, it may not be appropriate to treat the covariance
matrix as a single entity, since motion affects different
elements of the matrix differently.

The  behavior  of  the  covariance  under  interference  motion 
can be visualized in the epoch frequency domain. This
domain is obtained by Fourier transforming the time samples
of each element of the
covariance matrix. It was observed that energy from each
moving  ship  falls  along  an  approximate  line  in  the  epoch 
frequency  /  element  separation  plane.  Several  methods  for 
obtaining  improved  estimates  of  the  ensemble  mean 
covariance were suggested. A preliminary investigation of
the relative performance of a few of these methods was
carried out via simulation.

Much  remains  to  be  done  to  develop  these  methods  further. 
There  is  great  potential  for  refinement  of  the  algorithms 
and  development  of  better  filtering  techniques.  The  epoch 
frequency  domain  has  only  begun  to  be  explored.  Line 
detection  techniques  have  yet  to  be  attempted.  It  has  been 
suggested that the covariance matrix may also have a
near-Toeplitz structure in the epoch frequency domain [5]. If so,
then Toeplitz averaging, or low-pass filtering along the
Toeplitz directions, may provide additional rejection of
sample  noise.  Finally,  applications  of  this  class  of 
techniques  to  real  data  are  certainly  warranted. 
ABSTRACT

A  beamspace  adaptive  beamformer  implementation  for 
the  rejection  of  cable  strum  self-noise  on  passive  sonar 
towed arrays is presented. The approach focuses on the
implementation of a white noise gain constraint based
on the scaled projection technique due to Cox et al.
[IEEE Trans. on ASSP, Vol. 35 (10), Oct. 1987]. The
objective is to balance the aggressive adaptation
necessary for nulling the strong mainlobe interference
represented by cable strum against the conservative
adaptation required for protection against signal
self-nulling associated with steering vector mismatch.
Particular attention is paid to the definition of white
noise gain as the metric that reflects the level of
mainlobe adaptive nulling for an adaptive beamformer.
Adaptation control is subsequently performed through
the implementation of a constraint on maximum
allowable white noise gain at the output of the adaptive
processor. The theoretical development underlying the
scaled projection based constraint implementation is
reviewed. Towed array data results depicting the
performance gain of the new ABF algorithm optimized
for strum cancellation relative to that of a more
conservative baseline ABF algorithm are presented.

1.  INTRODUCTION 

Hydrodynamic  self-noise  on  passive  sonar  towed  arrays 
has  been  a  well-known  performance-limiting  factor  for 
ocean  acoustic  source  detection  at  low  frequency  [1]. 
High  wavenumber  mechanical  vibrations  are  induced  in 
the  array  by  vortex  shedding  associated  with 
hydrodynamic  flow  over  the  array  body  and  cable  scope. 
These vibrations are known to couple into the hydrophone
array  as  coherent  acoustic  noise  sources  and  can  impair 
acoustic  detection  performance,  particularly  in  the 
forward  endfire  direction.  As  a  direct  consequence  of  its 
spatially  coherent  nature,  it  has  been  shown  that  cable 
strum  noise  effects  can  be  mitigated  via  adaptive 


processing [2]. In this work, a new approach to coherent
strum noise mitigation is introduced, based on a beamspace
adaptive beamformer (ABF) architecture with a white noise gain
constraint (WNGC) that emphasizes mainlobe
interference nulling. Finally, data results
illustrating  the  performance  improvement  over  an 
existing  beamspace  ABF  algorithm  that  emphasizes 
robustness  to  mismatch-induced  self-nulling  are 
presented. 

2.  THE  PHYSICS  OF  CABLE  STRUM 
2.1  Vortex  shedding 

When  an  array  is  subject  to  hydrodynamic  flow  with  a 
component  normal  to  its  axis,  a  wake  is  formed.  When 
the  velocity  of  the  transverse  flow  increases  beyond  a 
certain  threshold,  eddies,  or  vortices,  begin  to  form  and 
separate  from  the  wake.  Eventually  these  vortices  shed 
from  the  wake  in  an  asymmetric  fashion  [3].  This 
asymmetric  shedding  imparts  an  oscillatory  lift  force 
locally  on  the  array  which,  depending  on  the  properties 
of  the  array  such  as  tension  and  density,  can  excite 
transverse  vibrations  which  propagate  along  the  array 
axis.  The  frequency  of  vortex  shedding  in  hydrodynamic 
flow  is  related  to  properties  of  the  flow  and  the  array  via 
the empirically determined Strouhal relation [1]:

$f_s = \dfrac{S\,v}{d}$

where $f_s$ is the vortex shedding frequency, S is the Strouhal number, equal to 0.21 in the
laminar  flow  regime  characteristic  of  most  towed  array 
environments,  v  is  the  velocity  of  flow  normal  to  the 
array  axis,  and  d  is  the  cable  diameter.  Note  that  the 
normal  component  of  velocity  of  flow  can  vary  with  time 
in  response  to  platform  motion  and  local  inhomogeneities 
in the turbulent medium.

The  transfer  function  to  which  the  Strouhal  excitation  is 
applied  is  governed  by  the  wave  equation  subject  to  the 
boundary  conditions  of  the  array  under  tow.  For  example, 
assuming fixed boundary conditions for the array, the
preferred frequencies of vibration, or modes, of the array
corresponding to the solution of the wave equation are
given by:

$f_n = \dfrac{n}{2L}\sqrt{\dfrac{T}{m_c}}, \quad n = 1, 2, \ldots$

where T is cable tension, $m_c$ is mass per unit length of the
cable,  and  L  is  the  cable  length.  Figure  1  depicts  notionally  the 
interaction  of  the  Strouhal  excitation  with  the  structural  modes 
of  the  array.  Cable  strum  due  to  vortex  shedding  is  strongly 
excited  when  the  Strouhal  excitation  frequency  is  closely 
aligned  with  a  resonant  mode  of  the  cable  transfer  function. 
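
The interaction between the Strouhal excitation and the structural modes can be illustrated numerically. The sketch below assumes the standard forms f_s = S v / d for the shedding frequency and f_n = (n / 2L) sqrt(T / m_c) for a fixed-fixed cable; the function names and any numerical values passed to them are hypothetical and are not taken from the paper.

```python
import math

def strouhal_frequency(flow_speed, diameter, strouhal_number=0.21):
    """Vortex shedding frequency (Hz) for normal flow speed (m/s) and diameter (m)."""
    return strouhal_number * flow_speed / diameter


def string_mode_frequencies(tension, mass_per_length, length, n_modes):
    """First n_modes resonant frequencies (Hz) of a cable fixed at both ends."""
    wave_speed = math.sqrt(tension / mass_per_length)     # transverse wave speed
    return [n * wave_speed / (2.0 * length) for n in range(1, n_modes + 1)]


def nearest_mode(f_excitation, mode_freqs):
    """Return (mode number, mode frequency) closest to the excitation frequency."""
    idx = min(range(len(mode_freqs)),
              key=lambda i: abs(mode_freqs[i] - f_excitation))
    return idx + 1, mode_freqs[idx]
```

Strong strum would be expected when the frequency returned by `strouhal_frequency` lies close to one of the values returned by `string_mode_frequencies`.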

2.2  Wavenumber-frequency  analysis 

The  decomposition  of  an  array  snapshot  into  its  constituent 
acoustic  and  non-acoustic  components  is  accomplished  using  a 
wavenumber-frequency, or k-ω, transform. The k-ω transform
is a 2-D FFT in space and time. The maximum unambiguous
wavenumber resolvable is equal to π/d, where d is the sensor
spacing. Resolution in wavenumber is governed by the aperture
length, L. For non-dispersive propagation, frequency and
wavenumber are linearly related via

$\omega = c_p k$

where $c_p$ equals the phase speed of the wavefront.
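
A wavenumber-frequency transform of this kind can be sketched with a 2-D FFT in NumPy. The function name and the array layout (time along the first axis, sensors along the second) are assumptions. Note that with NumPy's FFT sign convention a wave of the form exp(j(ωt − kx)) appears at positive frequency and negative wavenumber.

```python
import numpy as np

def k_omega_spectrum(snapshots, sensor_spacing, sample_rate):
    """2-D FFT in time and space of array data.

    snapshots: array (n_time, n_sensors) of element time series.
    Returns (freqs_hz, wavenumbers_rad_per_m, power), with both axes
    shifted so that zero frequency and zero wavenumber are centred.
    """
    snapshots = np.asarray(snapshots)
    n_time, n_sensors = snapshots.shape
    spectrum = np.fft.fftshift(np.fft.fft2(snapshots))
    power = np.abs(spectrum) ** 2
    freqs = np.fft.fftshift(np.fft.fftfreq(n_time, d=1.0 / sample_rate))
    wavenumbers = 2.0 * np.pi * np.fft.fftshift(
        np.fft.fftfreq(n_sensors, d=sensor_spacing))
    return freqs, wavenumbers, power
```

The wavenumber axis spans from −π/d up to just below π/d, consistent with the maximum unambiguous wavenumber quoted above.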

Figures 2 and 3 depict k-ω plots for two towed arrays
under  consideration  in  this  work.  The  first  exhibits 
superior  vibration  isolation  and  higher  resolution  due  to 
its  longer  aperture.  This  array  experiences  only  weak 
sidelobe  leakage  of  vibrational  modes  into  the  acoustic 
cone.  As  such,  under  nominal  operating  conditions,  this 
array  does  not  exhibit  a  pronounced  cable  strum 
interference  problem.  The  second  array  is  characterized 
by  limited  vibration  isolation.  It  is  subject  to  significant 
leakage  of  vibrational  energy  into  the  acoustic  cone  via 
mainlobe  penetration  in  forward  endfire.  Leakage  of 
vibrational  energy  into  acoustic  forward  endfire  is  a 
strong function of own-ship tow speed. For this array,
which  is  the  subject  array  for  this  paper,  cable  strum 
represents  a  significant  mainlobe  interference  problem. 


3.  BEAMSPACE  ABF  FOR  CABLE 
STRUM 

The  ABF  architecture  under  consideration  in  this  paper 
consists  of  a  frequency-domain  beamspace  adaptive 
beamformer. The adaptive beamspace consists of a
7-dimensional beam fan with fixed cosine spacing. The
beam  fan  translates  with  steering  direction. 

The  beamspace  ABF  derives  its  cable  strum  nulling  capability 
from the fact that near endfire the beam fan is partially
composed  of  beams  steered  to  high  wavenumber  non-acoustic 
space. 

For  each  time  epoch,  the  element  timeseries  are  transformed  to 
the  frequency  domain  via  FFT.  A  beamspace  covariance  matrix 
is formed for each frequency bin independently and a
7-dimensional beamspace MVDR weight vector is subsequently
computed.  Adaptation  control  is  governed  by  setting  a  limit  on 
the  maximum  allowable  white  noise  gain  for  the  adaptive 
processor. 
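
The per-bin processing chain described above might be sketched as follows. This is an illustration under stated assumptions rather than the system's implementation: the beam fan is built from conventional beams spaced by 1/(N d/λ) in cosine (which makes them orthogonal for a uniform line array), and the names `steering_vector`, `beam_fan`, and `beamspace_mvdr` are hypothetical.

```python
import numpy as np

def steering_vector(n_elem, d_over_lambda, u):
    """Plane-wave steering vector for direction cosine u (unit-modulus entries)."""
    n = np.arange(n_elem)
    return np.exp(2j * np.pi * d_over_lambda * n * u)


def beam_fan(n_elem, d_over_lambda, u_steer, n_beams=7, du=None):
    """Matrix T (n_elem, n_beams): beams centred on u_steer, fixed cosine spacing."""
    if du is None:
        du = 1.0 / (n_elem * d_over_lambda)        # assumed orthogonal spacing
    offsets = (np.arange(n_beams) - n_beams // 2) * du
    beams = [steering_vector(n_elem, d_over_lambda, u_steer + o) for o in offsets]
    return np.stack(beams, axis=1) / np.sqrt(n_elem)


def beamspace_mvdr(bin_snapshots, T, a_steer):
    """MVDR weights in beamspace for one frequency bin.

    bin_snapshots: (n_snap, n_elem) FFT outputs of the elements for this bin.
    Returns (w, R): beamspace weight vector and beamspace sample covariance.
    """
    z = bin_snapshots @ T.conj()                   # beam outputs T^H x per snapshot
    R = z.T @ z.conj() / z.shape[0]                # beamspace sample covariance
    v = T.conj().T @ a_steer                       # steering vector in beamspace
    Rinv_v = np.linalg.solve(R, v)
    w = Rinv_v / np.vdot(v, Rinv_v)                # unity gain in steering direction
    return w, R
```

Because the fan translates with the steering cosine, a fan centred near endfire (for example u = 0.9) includes beams at cosines greater than 1, that is, beams steered into non-acoustic wavenumber space.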

3.1  White  Noise  Gain 

White  noise  gain  (WNG)  is  defined  as  the  gain  applied 
by  the  adaptive  beamformer  to  a  spatially  white  input 
noise  process,  and  is  represented  by 

$\mathrm{WNG} = \mathbf{w}^H \mathbf{w},$

where $\mathbf{w}$ represents the MVDR beamformer steering
vector given by

$\mathbf{w} = \dfrac{\mathbf{R}^{-1}\mathbf{v}}{\mathbf{v}^H \mathbf{R}^{-1}\mathbf{v}}.$

The vector $\mathbf{v}$ represents the CBF weight vector and the
matrix $\mathbf{R}$ denotes the sample covariance for the current
processing bin. (Actually, the beamformer WNG is a
quantity equally applicable to the output of the CBF
beamformer, expressed as $\mathbf{v}^H\mathbf{v}$). Beamformer WNG is a
measure  of  the  level  of  mainlobe  adaptive  nulling 
effected  by  the  beamformer  steering  vector.  As  such,  a 
constraint  on  maximum  allowable  WNG  can  be  used  to 
control  the  level  of  mainlobe  adaptation  of  the  adaptive 
beamformer  relative  to  that  of  the  ideal  conventional 
beamformer: 

$\mathbf{w}^H\mathbf{w} \le \dfrac{\beta}{N}.$

Here $\beta$ is a constant ranging from 1 to infinity, with 1
representing  CBF  performance  (no  adaptive  nulling 
capability  and  best  robustness  to  mismatch)  and  infinity 
representing  MVDR  performance  (most  adaptive  nulling 
capability  and  most  sensitivity  to  mismatch).  Note  that 
under  this  convention,  the  quantity  1/N  represents  the 
WNG  of  the  conventional  CBF  beamformer,  where  N 
equals  the  number  of  elements  in  the  array. 
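
The WNG metric and the constraint can be checked numerically with a short sketch. It assumes a steering vector with unit-modulus entries, so that the conventional weights are the steering vector divided by N and their WNG is 1/N; the function names and the half-wavelength element spacing are assumptions made for the example.

```python
import numpy as np

def steering_vector(n_elem, u, d_over_lambda=0.5):
    """Plane-wave steering vector for direction cosine u."""
    return np.exp(2j * np.pi * d_over_lambda * np.arange(n_elem) * u)


def mvdr_weights(R, a):
    """MVDR weights with unity gain on steering vector a."""
    Rinv_a = np.linalg.solve(R, a)
    return Rinv_a / np.vdot(a, Rinv_a)


def white_noise_gain(w):
    """WNG = w^H w."""
    return float(np.vdot(w, w).real)


def relative_wng_db(w):
    """WNG relative to the CBF value 1/N, in dB (0 dB corresponds to CBF)."""
    return 10.0 * np.log10(white_noise_gain(w) * w.size)


def satisfies_wngc(w, beta):
    """True when w^H w <= beta / N."""
    return white_noise_gain(w) <= beta / w.size
```

For a strong interferer placed inside the mainlobe, `relative_wng_db` of the MVDR weights rises well above 0 dB, which is the elevation in WNG that the constraint is designed to limit.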

The  relative  WNG  is  a  particularly  important  metric  to 
consider  when  the  source  of  interference  lies  within  the 
beamformer mainbeam. Figure 4 depicts the behavior of
the  WNG  of  the  minimum  variance  distortionless 
response  (MVDR)  ABF  relative  to  that  of  CBF  for  a 
simulation  scenario  in  which  an  interferer  is  swept  across 
cosine  space  and  permitted  to  penetrate  the  beamformer 
mainbeam.  An  elevation  in  WNG  results  from  the  ABF 
algorithm  attempting  to  drive  a  mainlobe  null  concurrent 
with  satisfying  the  MVDR  unity  gain  constraint  in  the 
steering direction. The three inset figures show ABF
(shown in red) and CBF (shown in blue) beampatterns in
the vicinity of the steering direction for three different
interferer cosine positions. The sequence attempts to
connect the WNG cosine dependence with the ABF
beampattern shape as the interferer cosine approaches the
steering direction. When the interferer is far in the
sidelobe of the array beampattern (inset 3), the ABF and
CBF mainlobe beampatterns effectively overlap. In this
case, a simple sidelobe null (not pictured) is all that is
needed in order to maximize signal-to-interference-plus-noise
ratio (SINR). As the interferer penetrates the
mainbeam, a squinting or splitting of the adaptive
beampattern occurs coincident with the introduction of a
mainlobe null. This squinting is the result of the
beamformer's attempt to maximize SINR by trading off
interference  suppression  against  excess  white  noise  gain 
in  the  vicinity  of  the  steering  direction. 

3.2  Adaptivity/Robustness  Tradeoff 

Figure  4  illustrated  how  an  elevation  in  WNG  occurs  in 
response  to  a  mainlobe  interferer.  We  may  conclude  that 
WNG  is  a  measure  of  the  mainbeam  adaptive  nulling 
being  performed  by  the  ABF.  It  is  important  to 
understand  that  the  ABF  algorithm  is  unable  to 
distinguish  between  most  forms  of  signal  model 
mismatch  and  a  mainlobe  interferer.  Thus,  the  ABF  will 
interpret  steering  vector  mismatch  as  mainlobe 
interference  and  attempt  to  cancel  it  as  well.  Some  degree 
of  steering  vector  mismatch  is  unavoidable  in  real  towed 
array  data  applications.  Common  sources  of  mismatch 
include  manifold  uncertainty,  sensor  calibration  error, 
and  unmodeled  multipath  propagation.  The  beamformer 
signal  model  is  based  on  an  assumption  of  a  perfect  plane 
wave  with  known  sensor  gain  and  known  relative  sensor 
location.  As  the  ABF  algorithm  will  attempt  to  null  any 
data  component  that  deviates  from  these  assumptions, 
self-nulling  due  to  steering  vector  mismatch  is  a  major 
concern.  By  imposing  a  constraint  on  the  maximum 
allowable  WNG  of  the  adaptive  beamformer,  robustness 
to  mismatch  induced  nulling  may  be  introduced. 

Analyses  of  towed  array  data  have  shown  that  to  effect  a  useful 
level  of  strum  rejection  using  the  beamspace  ABF  algorithm,  a 
fairly  aggressive  adaptation  strategy  is  required.  By  contrast, 
signal  protection  against  self-nulling  in  the  cable  strum  band 


requires  a  very  conservative  adaptation  approach.  In  this  work, 
it  was  empirically  determined  that  a  WNGC  of  6  dB,  or  a 
maximum  allowable  WNG  of  4x  that  of  the  CBF  beamformer, 
represents  the  best  compromise  between  mainlobe  cable  strum 
nulling  and  signal  preservation  in  the  presence  of  mismatch. 

3.3  Adaptive  Weight  Power  Scaling 

The  white  noise  gain  constraint  (WNGC)  employed  in 
the  beamspace  ABF  architecture  is  based  on  the  scaled 
projection technique first proposed by Cox et al. [4].

The  scaled  projection  WNGC  implementation  is 
composed  of  two  essential  parts.  First,  the  MVDR  weight 
vector  is  decomposed  into  two  orthogonal  components, 
non-adaptive  and  adaptive  components  respectively, 
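using the following beamspace projection operators:

$\mathbf{P}_{na} = \dfrac{\mathbf{v}\mathbf{v}^H}{\mathbf{v}^H\mathbf{v}}, \qquad \mathbf{P}_a = \mathbf{I} - \dfrac{\mathbf{v}\mathbf{v}^H}{\mathbf{v}^H\mathbf{v}}.$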

Second,  upon  a  WNG  threshold  exceedance,  the  adaptive 
component  thus  isolated  is  scaled  such  that  the  WNGC  at 
the  beamformer  output  is  met  exactly. 

The  orthogonal  decomposition  prior  to  adaptive  weight 
scaling  is  important.  This  step  guarantees  that  the  weight 
scaling will be applied only to the adaptive, or
data-dependent, component of the ABF weight vector. This
ensures that the scaling process does not modify, scale, or
rotate  the  beamformer  response  to  a  signal  that  is 
perfectly  matched  to  the  steering  vector.  Consequently, 
the  scaling  preserves  the  constraint  of  distortionless 
response  in  the  steering  direction.  The  adaptive 
component  of  the  MVDR  weight  vector  is  given  by: 

$\mathbf{w}_a = \mathbf{P}_a \mathbf{w}.$

It  is  straightforward  to  verify  that  the  non-adaptive  and 
adaptive  components  derived  in  this  way  are  indeed 
orthogonal. The scaled output weight vector is then given
by:

$\mathbf{w}_o = \mathbf{w}_{na} + \kappa\,\mathbf{w}_a,$

where the scalar $\kappa$ represents the scaling coefficient. We
then specify the WNGC at the output of the beamspace
ABF processor in terms of a multiplier on the non-adaptive WNG,

$\mathbf{w}_o^H \mathbf{T}^H \mathbf{T}\,\mathbf{w}_o \le \alpha\,\mathbf{w}_{na}^H \mathbf{T}^H \mathbf{T}\,\mathbf{w}_{na}.$

Here, $\mathbf{T}$ represents the 7-dimensional transformation
from element space to adaptive beamspace. For a 6 dB
WNGC the multiplier, $\alpha$, is equal to 4. Solving the
constraint equation results in a quadratic in the scaling
coefficient, $\kappa$ [5]. The result is two solutions for $\kappa$ which
meet the constraint exactly. We choose the value which
is carried out at each processing epoch and for each
frequency  bin  independently.  A  geometric  interpretation 
of  the  weight  scaling  procedure  is  shown  in  Figure  5. 
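
The weight scaling procedure can be sketched as below. This is a hedged illustration, not the fielded algorithm: it assumes the non-adaptive component is the projection of the weight vector onto the CBF vector v, that the scaling coefficient is real, and that the beamspace transformation enters only through its Gram matrix (here the hypothetical argument `G`, taken as the identity when omitted). The function name is also an assumption.

```python
import numpy as np

def scaled_projection_wngc(w, v, R, alpha=4.0, G=None):
    """Scale the adaptive part of w so that WNG <= alpha * non-adaptive WNG.

    w: weight vector (e.g. MVDR); v: non-adaptive (CBF) vector;
    R: sample covariance, used to compare output power of the two roots;
    G: Gram matrix T^H T of the beamspace transform (identity if omitted).
    Returns (w_out, kappa); kappa is 1.0 when no scaling is needed.
    """
    G = np.eye(w.size) if G is None else G
    form = lambda x, M, y: np.vdot(x, M @ y)          # x^H M y
    w_na = v * (np.vdot(v, w) / np.vdot(v, v))        # projection onto v
    w_a = w - w_na                                    # orthogonal (adaptive) part
    wng_na = form(w_na, G, w_na).real
    if form(w, G, w).real <= alpha * wng_na:
        return w, 1.0                                 # constraint already met
    # Constraint met with equality: A*kappa^2 + B*kappa + C = 0
    A = form(w_a, G, w_a).real
    B = 2.0 * form(w_na, G, w_a).real
    C = (1.0 - alpha) * wng_na
    root = np.sqrt(B * B - 4.0 * A * C)
    candidates = [(-B + root) / (2.0 * A), (-B - root) / (2.0 * A)]
    # Keep the root that gives the lower beamformer output power
    power = [form(w_na + k * w_a, R, w_na + k * w_a).real for k in candidates]
    kappa = candidates[int(np.argmin(power))]
    return w_na + kappa * w_a, float(kappa)
```

Because only the component orthogonal to v is scaled, the inner product of v with the output weights is unchanged, so the distortionless response in the steering direction is preserved.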

4.  TOWED  ARRAY  DATA  RESULTS 

Figure  6  depicts  frequency-azimuth  (FRAZ)  plots  for  a 
typical  time  epoch  for  each  of  four  different  processors: 
1) the CBF beamformer, 2) the conservative baseline
ABF  beamformer,  3)  the  aggressive  6  dB  WNGC  ABF 
optimized  for  strum  rejection,  and  4)  unconstrained 
MVDR.  The  baseline  ABF  represents  the  WNGC  as 
implemented  in  the  present  towed  array  processing 
system.  While  the  details  of  this  WNGC  implementation 
are  not  presented  here,  the  basic  design  philosophy  of 
this  ABF  algorithm  is  to  emphasize  robustness  to 
mismatch  effects.  Upon  exceeding  the  WNG  threshold, 
set  in  the  vicinity  of  2  dB,  the  baseline  ABF  scales  the 
adaptive  weight  vector  back  to  the  non-adaptive  or  CBF 
weight  vector.  This  severely  constrains  the  ability  of  the 
baseline  ABF  to  effectively  null  any  strong  mainlobe 
interference  such  as  cable  strum. 

In  Figure  6,  the  presence  of  a  strong  interference  source  with 
multiple  sidelobes  is  observed  near  broadside  in  the  CBF 
FRAZ display. As expected, all of the ABF approaches,
conservative  and  aggressive  alike,  demonstrate  the  capacity  to 
null  such  a  strong  discrete  sidelobe  interference  source.  This 
result  thus  serves  as  a  useful  consistency  check  of  algorithm 
implementation. 

Next,  we  direct  our  attention  to  the  cable  strum 
interference in the forward sector, i.e. near cosine equal to
1. In the normalized frequency band f = 0-0.3, cable
strum  is  observed  to  extend  over  a  wide  sector  of  cosine 
space  from  forward  endfire  to  near  broadside.  The 
important  differences  between  the  conservative  and 
aggressive  ABF  approaches  are  apparent  from  the  cable 
strum  rejection  performance  in  this  frequency  band.  The 
conservative  ABF  algorithm  does  very  little  to  reduce  the 
amplitude  of  the  strum  interference  in  forward  endfire. 
The bearing extent of the strum is reduced slightly. With
the 6 dB WNGC ABF, the bearing extent and amplitude of the
cable strum are significantly curtailed relative to those of the
conservative baseline ABF algorithm.

Figure 7 shows raw power spectral density plots to
further illustrate the performance improvement realized
with increasingly aggressive adaptation. Notice that the
cable strum ABF achieves as much as a 15 dB local
suppression of the strum-dominated noise floor in the
normalized frequency band f = 0.1-0.3. The resulting
noise floor suppression uncovers the presence of a
narrowband feature at f = 0.2 that was otherwise


undetectable  in  the  CBF  and  baseline  ABF 
configurations. 

Figure  8  shows  the  measured  WNG  plots  corresponding 
to the power spectral density plots of Figure 7. The
measured  WNG  illustrates  the  relationship  between 
WNGC  and  strum  rejection.  It  is  clear  that  at  a  WNGC  of 
6  dB  most  of  the  strum  noise  floor  suppression 
performance  is  realized.  Recall  that  the  point  here  is  to 
allow  the  ABF  algorithm  to  adapt  only  as  much  as 
necessary  to  effect  useful  cable  strum  noise  suppression. 

5.  CONCLUSIONS 

Mechanically  induced  towed  array  self-noise  limits 
detection  performance  in  passive  sonar  systems, 
particularly  at  forward  endfire.  In  this  work,  a  beamspace 
adaptive beamforming architecture for the rejection of
strong mainlobe cable strum in forward endfire
was presented. The approach focused on the choice of a
white noise gain constraint which achieved a suitable
balance between aggressive adaptation for effective strum
nulling  and  conservative  adaptation  for  robustness  to 
mismatch-induced  self-nulling.  A  WNGC  of  6  dB 
relative  to  the  WNG  for  the  non-adaptive  steering  vector 
was  empirically  determined  to  offer  the  best  balance.  The 
WNGC  implementation  was  based  on  the  scaled 
projection  technique  first  presented  by  Cox  et  al.  [4]. 
Significant  cable  strum  suppression  performance  was 
shown to be possible, on the order of 15 dB locally
within  the  strum  interference  band. 